(57) Abstract

This invention discloses a method for generating a recombinant library by introducing one or more changes within a
predetermined region of double-stranded nucleic acid, comprising providing a first primer population and a second primer
population, each of the populations having a variable base composition at known positions along the primers, the primers
incorporating a class IIS restriction enzyme recognition sequence, being capable of directing change in the nucleic acid
sequence and being substantially complementary to the double-stranded nucleic acid to permit hybridization thereto. The
method additionally comprises hybridizing the first and second primer populations to opposite strands of the
double-stranded nucleic acid to form a first pair of primer-templates oriented in opposite directions, performing
enzymatic inverse polymerase chain reaction to generate at least one linear copy of the double-stranded nucleic acid
incorporating the change directed by the primers, cutting the double-stranded nucleic acid copy with a class IIS
restriction enzyme to form a restricted linear nucleic acid molecule containing the change, joining termini of the
restricted linear nucleic acid molecule to produce double-stranded circular nucleic acid and introducing the nucleic
acid into compatible host cells. A method is additionally provided for generating a recombinant library using
wobble-base mutagenesis.

ENZYMATIC INVERSE POLYMERASE CHAIN REACTION LIBRARY MUTAGENESIS

BACKGROUND OF THE INVENTION

Recombinant DNA techniques have revolutionized molecular
biology and genetics by permitting the isolation and
characterization of specific DNA fragments. Of major impact
has been the exponential amplification of small amounts of DNA
by a technique known as the polymerase chain reaction (PCR).
The sensitivity, speed and versatility of PCR make this
technique amenable to a wide variety of applications such as
medical diagnostics, human genetics, forensic science and
other disciplines of the biological sciences.

PCR is based on the enzymatic amplification of a DNA
sequence that is flanked by two oligonucleotide primers which
hybridize to opposite strands of the target sequence. The
primers are oriented in opposite directions with their 3' ends
pointing towards each other. Repeated cycles of heat
denaturation of the template, annealing of the primers to
their complementary sequences and extension of the annealed
primers with a DNA polymerase result in the amplification of
the segment defined by the 5' ends of the PCR primers. Since
the extension product of each primer can serve as a template
for the other primer, each cycle results in the exponential
accumulation of the specific target fragment, up to several
million fold in a few hours. The method can be used with a
complex template such as genomic DNA and can amplify a
single-copy gene contained therein. It is also capable of amplifying
a single molecule of target DNA in a complex mixture of RNAs
or DNAs and can, under some conditions, produce fragments up
to ten kb long. The PCR technology is the subject matter of
United States Patent Nos. 4,683,195, 4,800,159, 4,754,065, and
4,683,202, all of which are incorporated herein by reference.
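The "several million fold" figure follows from simple doubling arithmetic. The sketch below is an idealised model only: it assumes every cycle multiplies the copy number by a fixed factor (1 + efficiency), and the function name and the efficiency parameter are illustrative assumptions rather than anything specified in the text.

```python
def ideal_fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase after `cycles` PCR cycles in an idealised model.

    Assumption: each cycle multiplies the number of target copies by
    (1 + efficiency), where efficiency = 1.0 means perfect doubling.
    Real reactions plateau, so this is an upper-bound style estimate.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Twenty perfect doublings already exceed a million-fold increase.
    print(ideal_fold_amplification(20))
    # With 80% per-cycle efficiency, 25 cycles are needed to pass a million.
    print(ideal_fold_amplification(25, efficiency=0.8) > 1e6)
```

Under this model, 20 to 25 cycles are enough to reach the million-fold range, which is consistent with the time scale of a few hours mentioned above.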
In addition to the use of PCR for amplifying target
sequences, this method has also been used to generate
site-specific mutations in known sequences. Mutations are created
by introducing mismatches into the oligonucleotide primers
used in the PCR amplification. The oligonucleotides, with
their mutant sequences, are then incorporated at both ends of
the linear PCR product. In addition to their mutated
sequences, the primers often contain restriction enzyme
recognition sequences which are used for subcloning the
mutated linear DNAs into vectors in place of the wild-type
sequences. Although this procedure is relatively simple to
perform, its applications are limited because appropriate
restriction sequences are not always conveniently located for
substituting the mutant sequence for the wild-type sequence.
Restriction sequences can be incorporated into the wild-type
sequences for subcloning. However, such extraneous sequences
can cause detrimental effects to the function of the gene or
resulting gene product. Moreover, PCR products typically
contain heterogeneous termini resulting from the addition of
extra nucleotides and/or incomplete extension of the
primer-templates. Such termini are extremely difficult to ligate and
therefore result in a low subcloning efficiency.

Several modifications of the PCR-based site-directed
mutagenesis strategies have been developed to circumvent such
limitations, but they too have undesirable features. The most
prominent undesirable feature exhibited by these alternative
methods is a low frequency of correct mutations. For example,
inverse PCR (IPCR) is a method which amplifies a circular
plasmid rather than a linear molecule, Hemsley et al., Nuc.
Acid. Res. 17:6545-6551 (1989), which is incorporated herein
by reference. In this technique, two primers which are
located back to back on opposing DNA strands of a plasmid
drive the PCR reaction. The resultant PCR product, a linear
DNA molecule identical in length to the starting plasmid,
contains any mutations which were designed into the primers.
The product is then enzymatically prepared for ligation by
blunting and phosphorylating the termini. Enzymatic treatment
of the termini is a necessary step for ligation due to
heterogeneous termini associated with PCR products. These
treatments are likely to be incomplete and cause unwanted
mutations as well as result in a low ligation and
transformation efficiency due to the additional required
steps.

Recombinant circle PCR (RCPCR), Jones and Howard,
BioTechniques 8:178-183 (1990), and recombination PCR (RPCR),
Jones and Howard, BioTechniques 10:62-65 (1991), on the other
hand, are two methods similar to IPCR which do not require any
enzymatic treatment. In RCPCR, two separate PCR reactions,
requiring a total of four primers, are needed to generate the
mutated product. The separate amplification reactions are
primed at different locations on the same template to generate
products that, when combined, denatured and cross-annealed,
form double-stranded DNA with complementary single-strand
ends. The complementary ends anneal to form DNA circles
suitable for transformation into E. coli.

RPCR is a technique that uses PCR primers having a twelve
base exact match at their 5' ends, resulting in a PCR product
with homologous double-stranded termini. Transformation of
the linear product into recombination-positive (recA-positive)
cells produces a circular plasmid through in vivo
recombination. Although this method reduces the number of
steps and primers used compared to RCPCR, the transformation
and recombination of linear molecules is an inefficient
process resulting in a correspondingly low mutation frequency.

A modification of site-directed mutagenesis, random
mutagenesis, permits the incorporation of random mutations
into a polynucleotide. Mutant libraries are normally
constructed by the mutagenesis of a small, defined area of a
plasmid containing the gene or control region of interest.
Methods for generating mutant libraries typically use
synthetic oligonucleotides with random or biased mixtures of
bases in one or more positions along the oligonucleotide. A
variety of methods have been used to introduce these mutagenic
oligonucleotides into the expression vector. Typically, the
oligonucleotides are hybridized to a substantially
complementary strand of DNA and a polymerase is used to extend
the length of the oligonucleotide into a polynucleotide whose
length is dependent both on the length of the template and on
the conditions of enzymatic extension. This procedure permits
the construction of large libraries of mutants having
mutations in one or more regions of the polynucleotide or
protein sequence as compared with the template. From these
libraries, the transfectants or transformants can be screened
for the desired characteristic. However, both random
mutagenesis employing PCR and random mutagenesis in general
are restricted in design by the choice of restriction
endonucleases traditionally employed for these procedures.
Often random mutagenesis has a relatively low efficiency, such
that a significant number of individual mutations are lost
during primer extension and introduction of the polynucleotide
into the host. Further, mistakes or unintended mutations are
often incorporated into the sequences, resulting in an
additional decrease in the efficiency. Selected mutations may
therefore be under- or overrepresented in the library.
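To make the idea of an oligonucleotide with "random or biased mixtures of bases" concrete, the following sketch expands a degenerate primer written in the standard IUPAC ambiguity codes and counts how many distinct sequences the resulting population can contain. The function names and the example sequences are hypothetical and are not taken from the specification.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def library_size(degenerate: str) -> int:
    """Number of distinct sequences encoded by a degenerate oligonucleotide."""
    size = 1
    for symbol in degenerate.upper():
        size *= len(IUPAC[symbol])
    return size


def expand(degenerate: str) -> list[str]:
    """Enumerate every sequence in the population (small inputs only)."""
    choices = (IUPAC[symbol] for symbol in degenerate.upper())
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    # Hypothetical primer segment with one fully random codon written as NNK.
    print(library_size("GANNKC"))
    print(expand("AYG"))
```

Comparing `library_size` with the number of transformants actually recovered gives a rough sense of whether individual mutations are likely to be lost or unevenly represented, which is the concern raised above.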

Thus, a need exists for a PCR-based mutagenesis method
which allows the rapid and efficient alteration of nucleotide
sequences to create libraries that are sufficiently diverse.
The present invention satisfies this need and provides related
advantages as well.

DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic diagram outlining the steps of 
EIPCR. 

Figure 2 shows the design of EIPCR primers. Line A shows
a region of the PCR template (SEQ ID NO: 1) and two mutations
to be made by EIPCR (indicated by small arrows). Line B shows
how the primers (SEQ ID NO: 2; SEQ ID NO: 3) relate to the
mutated product (line C) (SEQ ID NO: 4). This is not an
actual reaction intermediate, but is a cartoon to draw when
designing the primers. The primers are indicated in grey.
The Bsa I recognition sequence (SEQ ID NO: 5) is underlined.
Four or more bases are added 5' to the enzyme recognition
sequence of each primer to ensure efficient substrate
recognition by the enzyme. Line C shows the sequence of the
mutated product. The grey boxes show the parts of the primer
that have been incorporated into the final product. The
overhangs of the two DNA ends are indicated, but the
recognition sequences have been cut off and are not part of
the final product.

Figure 3 is a list of class IIS restriction enzymes and
the nucleotide sequence of their recognition sequences (SEQ ID
NOS: 5 through 20).

Figure 4 is a schematic diagram showing the use of EIPCR
technology for generating single chain antibodies. Line A
shows the template region (SEQ ID NO: 21) to be mutagenized to
create a linker between heavy and light chain encoding
sequences. Line B shows the EIPCR primer design (SEQ ID NO:
22; SEQ ID NO: 23) and line C shows the nucleotide (SEQ ID NO:
24) and amino acid (SEQ ID NO: 25) sequence of an identified,
active single chain antibody sequence.

Figure 5 is a schematic of the 1.8 kb expression vector
pMCHAFvl for CHA255 Fv fragment expression. The expression
cassette is located between Hind III and Eco RI restriction
endonuclease sequences in pUC19.

Figure 6 is a schematic of EIPCR primer design. Line A
shows the area of the wildtype leader sequence that was
replaced by a library of leader sequences. Line B shows the
design of the mutagenic primers relative to the template (SEQ
ID NO: 26 and SEQ ID NO: 27). Line C shows the sequence of
the identified, positive single chain Fv linker conferring
increased protein expression that was obtained from the random
library (SEQ ID NO: 28).

Figure 7 is a schematic illustrating EIPCR promoter
library mutagenesis. Figure 7A is the template sequence. The
underlined regions in Figure 7B indicate the regions of
variability in the library.

SUMMARY OF THE INVENTION

The invention is directed to a method for generating a
recombinant mutagenesis library by introducing one or more
changes within a predetermined region of double stranded
nucleic acid, comprising providing a first primer population
and a second primer population, each population having a
variable base composition at known positions along the
primers, the primers incorporating a class IIS restriction
enzyme recognition sequence, being capable of directing change
in the nucleic acid sequence and being substantially
complementary to the double-stranded nucleic acid to allow
hybridization thereto. The method also comprises hybridizing
the first and second primer populations to opposite strands of
the double-stranded nucleic acid to form a first pair of
primer-templates oriented in opposite directions, performing
enzymatic inverse polymerase chain reaction to generate at
least one linear copy of the double stranded nucleic acid
incorporating the change directed by the primers, cutting the
double stranded nucleic acid copy with a class IIS restriction
enzyme to form a restricted linear nucleic acid molecule
containing the change and introducing nucleic acid generated
therefrom into compatible host cells.

In a preferred embodiment, the method additionally
comprises the step of joining termini of the restricted linear
nucleic acid molecule to produce double stranded circular
nucleic acid. The method preferably produces restricted
linear nucleic acid molecules containing only the directed
change in the nucleic acid sequence. Preferably the double
stranded nucleic acid is circular DNA. The method can be
performed on either eukaryotic or prokaryotic cells.

In a preferred embodiment of the invention, the double
stranded nucleic acid encodes polypeptide. The change in the
nucleic acid can be introduced into the amino acid coding
region of the polypeptide or into a regulatory region of the
polypeptide. Thus changes may be introduced into promoter and
enhancer regions of the double stranded nucleic acid. The
polypeptide encoded by the double stranded nucleic acid is
preferably expressed from the host cells.
In another preferred embodiment of the invention, the
double stranded nucleic acid comprises a viral vector and
compatible host cells comprise a helper virus packaging cell
line that directs the packaging of viral particles containing
the viral vector. The viral particles are preferably
collected and the method additionally comprises the step of
infecting susceptible cells with the viral particles.

In yet another preferred embodiment of the invention, a
method is provided for improving polypeptide expression from
a double-stranded nucleic acid sequence encoding polypeptide
comprising: measuring polypeptide expression from the double
stranded nucleic acid in a compatible host cell, providing a
first primer population and a second primer population, each
of the populations having a variable base composition at known
positions along the primers, the primers incorporating a class
IIS restriction enzyme recognition sequence, being capable of
directing change in the nucleic acid sequence and being
substantially complementary to the double stranded nucleic
acid to allow hybridization thereto. The method additionally
comprises hybridizing the first and second primer populations
to opposite strands of the double stranded nucleic acid to
form a first pair of primer-templates orientated in opposite
directions, performing enzymatic inverse polymerase chain
reaction to generate at least one linear copy of the double
stranded nucleic acid incorporating the change directed by the
primers, cutting the double stranded nucleic acid copy with a
class IIS restriction enzyme to form a restricted linear
nucleic acid molecule containing the change, introducing the
nucleic acid from the cutting step or the PCR step into host
cells and measuring polypeptide expression from the modified
nucleic acid in the cells, and identifying cells with
expression levels greater than the expression levels measured
in cells containing unmodified double stranded nucleic acid.
The method preferably additionally comprises the step of
joining termini of the restricted linear nucleic acid molecule
to produce modified double stranded circular nucleic acid, and
the method also preferably comprises the step of obtaining
modified template from selected cells. Preferably the
modified nucleic acid sequence is identified and transferred
into another nucleic acid sequence. The primers can direct
changes in a regulatory sequence, including promoters, or the
primers can direct changes in a polypeptide sequence. In a
preferred embodiment the primers direct changes in a ribosome
binding sequence.

In yet another preferred embodiment of this invention, a
method is provided for generating a recombinant library using
wobble-base mutagenesis comprising: providing a first primer
population and a second primer population, said primers being
substantially complementary to a region of double stranded
nucleic acid encoding polypeptide to allow hybridization
thereto, the primers having a variable base composition in the
third position of at least one nucleotide codon corresponding
to the double stranded nucleic acid and a class IIS
restriction enzyme recognition sequence. The method
additionally comprises hybridizing the first and second primer
populations to opposite strands of the double stranded nucleic
acid to form a first pair of primer-templates orientated in
opposite directions, performing enzymatic inverse polymerase
chain reaction to generate at least one linear copy of the
double stranded nucleic acid incorporating the change directed
by the primers, cutting the double stranded linear nucleic
acid with a class IIS restriction enzyme to form restricted
linear nucleic acid molecule containing the change and
introducing nucleic acid generated therefrom into compatible
host cells. The variable base codons preferably do not alter
the corresponding amino acid sequence of the polypeptide.
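The requirement that third-position ("wobble") changes leave the amino acid sequence unaltered can be checked against the standard genetic code. The sketch below is illustrative only: it assumes the standard code, and the function names and the example coding sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def silent_wobble_variants(codon: str) -> list[str]:
    """Codons that differ from `codon` only at the third base and encode the same residue."""
    codon = codon.upper()
    residue = CODON_TABLE[codon]
    return [codon[:2] + b for b in BASES if CODON_TABLE[codon[:2] + b] == residue]


def wobble_library_size(coding_seq: str) -> int:
    """Number of silent variants obtainable by varying every third position."""
    size = 1
    usable = len(coding_seq) - len(coding_seq) % 3  # ignore a trailing partial codon
    for start in range(0, usable, 3):
        size *= len(silent_wobble_variants(coding_seq[start:start + 3]))
    return size


if __name__ == "__main__":
    print(silent_wobble_variants("GCT"))     # alanine: all four third bases are silent
    print(silent_wobble_variants("ATG"))     # methionine: no silent alternative
    print(wobble_library_size("ATGAAAGCT"))  # Met-Lys-Ala
```

A primer population restricted to these variants changes the nucleotide sequence, for example of a leader sequence, while encoding the same polypeptide.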

In a preferred embodiment the primers direct alterations
in the leader sequence of the polypeptide. The leader
sequence is preferably the bacterial OmpA protein leader
sequence or a fragment thereof and the leader sequence is
preferably linked to polynucleotide encoding light and heavy
chain antibody fragments.

DETAILED DESCRIPTION OF THE INVENTION

The invention provides a novel method for rapid and
efficient site directed mutagenesis of double-stranded linear
or circular DNA. The method, termed Enzymatic Inverse
Polymerase Chain Reaction (EIPCR), greatly improves the
utility of previous PCR techniques, enabling rapid screening or
selection of putative mutants to identify clones containing
changes of interest.

In one embodiment, oligonucleotide primers containing the
desired sequence changes are used to direct PCR synthesis of
a double-stranded circular DNA template (Figure 1). The
primers are designed so that they additionally contain a class
IIS restriction enzyme recognition sequence and a sequence
complementary to the template for primer hybridization. The
primers are hybridized to opposite strands of the circular
template and direct the amplification of each strand to form
linear molecules containing the desired mutations. The ends
of the linear molecules are filled in with Klenow polymerase
or T4 DNA polymerase and restricted with the appropriate class
IIS restriction enzyme to produce compatible overhangs for
circularization and ligation.

EIPCR uses class IIS restriction enzyme recognition
sequences in the mutated or non-mutated PCR primers. This
type of recognition sequence is used because the cleavage site
is separated from the recognition sequence and therefore does
not introduce extraneous sequences into the final product.
Restriction of the PCR products with a class IIS enzyme
removes the recognition sequence and produces homogeneous
termini for subsequent ligation. Class IIS recognition
sequences therefore circumvent problems associated with
ligating heterogeneous PCR termini since such termini will be
cleaved off using a class IIS restriction enzyme. If the
primers are designed with complementary cleavage sites, the
resulting termini will have complementary overhangs which can
be used for circularization of the linear molecules. Such
complementary overhangs increase the efficiency of
intramolecular ligation compared to blunt ends and result in
a high percentage of correctly mutated clones. Thus, EIPCR
allows efficient mutagenesis and production of homogeneous
termini of any DNA template without incorporating extraneous
sequences. EIPCR also allows mutagenesis at any location
within a circular template independent of convenient
restriction sequences.

As used herein, the term "predetermined change" refers to
a specific desired change within a known nucleic acid 
sequence. Such desired changes are commonly referred to in 
the art as site directed mutagenesis and include, for example, 
additions, substitutions and deletions of base pairs. A 
specific example of a base pair change is the conversion of 
the first A/T bp in the sequence AGCA to a G/C bp to yield the 
sequence GGCA. It is understood that when referring to a base 
pair, only one strand of a double-stranded sequence, or one 
nucleotide of a base pair need be used to designate the 
referenced base pair change since one skilled in the art will 
know the corresponding complementary sequence or nucleotide. 
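The AGCA to GGCA example, and the convention that naming one strand implies the other, can be sketched as follows. The helper names are hypothetical, and positions are counted from zero.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def substitute(seq: str, position: int, new_base: str) -> str:
    """Replace the base at a zero-based position on the named strand."""
    if not 0 <= position < len(seq):
        raise IndexError("position lies outside the sequence")
    return seq[:position] + new_base + seq[position + 1:]


if __name__ == "__main__":
    wild_type = "AGCA"
    mutant = substitute(wild_type, 0, "G")   # first A/T bp becomes G/C
    print(mutant)                            # GGCA
    # The change on the opposite strand follows automatically.
    print(reverse_complement(wild_type))     # TGCT
    print(reverse_complement(mutant))        # TGCC
```

Only the named strand needs to be edited; the complementary strand of the mutant is fully determined by it.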

As used herein, the term "class IIS restriction enzyme
recognition sequence" refers to the recognition sequence of
class IIS restriction enzymes. Class IIS enzymes cleave
double-stranded DNA at precise distances from their
recognition sequence. The recognition sequence is generally
about four to six nucleotides in length and directs cleavage
of the DNA downstream from the recognition sequence. The
distance between the recognition sequence and the cleavage
site as well as the resulting termini generated in the
restricted product vary depending on the particular enzyme
used. For example, the cleavage site can be anywhere from one
to many nucleotides downstream from the 3'-most nucleotide of
the recognition sequence and can result in either blunt cuts
or 5' and 3' staggered cuts of variable length. Such
staggered cuts produce termini having single-stranded
overhangs. Therefore, "complementary cleavage sites" as used
herein refers to complementary nucleic acid sequences at such
single-stranded overhangs. Class IIS restriction enzyme
recognition sequences suitable for use in the invention can
be, for example, Alw I, Bsa I, Bbs I, Bbv I, BsmA I, Bsr I,
Bsm I, BspM I, Ear I, Esp3 I, Fok I, Hga I, Hph I, Mbo II,
Ple I, SfaN I, and Mnl I. It is understood that the recognition
sequence of any enzyme that utilizes this separation between
the recognition sequence and the cleavage site is included
within this definition.

As used herein, the term "substantially complementary"
refers to a nucleotide sequence capable of specifically
hybridizing to a complementary sequence under conditions known
to one skilled in the art. For example, specific
hybridization of short complementary sequences will occur
rapidly under stringent conditions if there are no mismatches
between the two sequences. If mismatches exist, specific
hybridization can still occur if a lower stringency is used.
Specificity of hybridization is also dependent on sequence
length. For example, a longer sequence can have a greater
number of mismatches with its complement than a shorter
sequence without losing hybridization specificity. Such
parameters are well known and one skilled in the art will
know, or can determine, what sequences are substantially
complementary to allow specific hybridization.

As used herein, the term "a primer capable of directing"
when used in reference to nucleic acid sequence changes refers
to a primer having a mismatched base pair or base pairs within
its sequence compared to the template sequence. Such
mismatches correspond to the mutant sequences to be
incorporated into the template and can include, for example,
additional base pairs, deleted base pairs or substitute base
pairs. It is understood that either one or both primers used
for the PCR synthesis can have such mismatches so long as
together they incorporate the desired mutations into the
wild-type sequence.

Thus, the invention provides methods of introducing at
least one predetermined change in a nucleic acid sequence of
a double-stranded DNA. Such methods include: (a) providing
a first primer and a second primer capable of directing said
predetermined change in said nucleic acid sequence, said first
and second primers comprising a nucleic acid sequence
substantially complementary to said double-stranded DNA so as
to allow hybridization, a class IIS restriction enzyme
recognition sequence and cleavage sites; (b) hybridizing said
first and second primers to opposite strands of said
double-stranded DNA to form a first pair of primer-templates oriented
in opposite directions; (c) extending said first pair of
primer-templates to create double-stranded molecules; (d)
hybridizing said first and second primers at least once to
said double-stranded molecules to form a second pair of
primer-templates; (e) extending said second pair of
primer-templates to produce double-stranded linear molecules
terminating with class IIS restriction enzyme recognition
sequences; and (f) restricting said double-stranded linear
molecules with a class IIS restriction enzyme to form
restricted linear molecules containing said change in said
nucleic acid sequence.

Enzymatic Inverse Polymerase Chain Reaction (EIPCR) is a
PCR-based method for performing site-directed mutagenesis.
Mutations are introduced into a DNA by first hybridizing
primers which contain the desired mutations to the DNA,
referred to herein as mutant primers. The resulting
primer-templates are enzymatically extended with a polymerase to
yield an intermediate product. Repriming of the intermediates
and polymerase extension will yield the final mutant product.
Cohesive termini can be subsequently generated for
circularization of the linear products by intramolecular
ligation.

The invention is described with particular reference to
introducing a predetermined change into a circular template
and recircularizing of the product to generate mutant copies
of the starting template. However, one skilled in the art can
use the teachings and methods described herein to similarly
generate mutations in linear templates. The primers designed
for use on linear templates are similar to those used for
circular templates. Appropriate modifications of primers for
use on linear templates are known to one skilled in the art
and will be determined by the intended use of the final mutant
product. For example, when generating circular products,
either from a linear or circular starting template, it is
beneficial to use primers containing complementary cleavage
sites downstream from the class IIS recognition sequence.
Such complementary sites greatly increase the efficiency of
intramolecular ligation. With linear molecules, on the other
hand, while it is beneficial in some cases for the primers to
contain class IIS recognition sequences which produce
single-stranded overhangs at their cleavage sites, such cleavage
sites need not be complementary. For example, if the product
is a linear molecule for subcloning into a vector, cleavage
sites which are not complementary can be used for directional
cloning of the product. Additionally, a blunt cleavage site
can be used to eliminate sequence requirements for subcloning.
Thus, depending on the desired product, the cleavage sites
within the primers can be complementary or non-complementary.

EIPCR primers are synthesized having three basic sequence
components. These sequences are used for generating mutations
and for enabling efficient formation of circular products
without introducing unwanted sequences or requiring the use of
template restriction sequences. The first sequence component
of the primers is the region which directs the predetermined
changes. This region contains the desired mutations which are
to be introduced into the template. The length and sequence
of this region will depend on the number and locations of
incorporated mutations. For example, if multiple and adjacent
mutations are desired, then the primer will not contain any
nucleotides within this region identical to the wild-type
sequence. However, if the mutations are not located at
adjacent positions, then the nucleotides in between such
mutations will be identical to the wild-type sequence and
capable of hybridizing to the appropriate complementary
strand. Thus, the region can be from one to many nucleotides
in length so long as it contains the desired mismatches with
the wild-type sequence.

It is only necessary for one of the primers to contain
the desired mutations, but a larger number of bases can be
mutagenized and a higher efficiency of correct mutations can
be obtained if both primers contain the desired mutations on
each complementary strand. A strategy for designing EIPCR
primers is outlined in Figure 2. This strategy shows an
example of a pair of primers which can be used for mutagenesis
at two nonadjacent locations. One skilled in the art can use
this strategy and the teachings described herein to design and
use primers that incorporate essentially any desired mutation
into a double-stranded DNA. The template containing the
wild-type sequence is shown in Figure 2A (SEQ ID NO: 1). Also
shown are the desired nucleotide substitutions (arrows). The
actual primers are depicted in Figure 2B as the shaded
sequence (SEQ ID NO: 2; SEQ ID NO: 3). The region of each
primer containing the desired substitutions is complementary
and corresponds to the opposite strand at the same location
within the template (Figure 2C) (SEQ ID NO: 4). For primers
A (SEQ ID NO: 2) and B (SEQ ID NO: 3) in Figure 2B, the mutant
region would consist of the sequence GTTCC and its complement,
respectively.

The second sequence component of EIPCR primers is the
region containing the class IIS restriction enzyme recognition
sequence. The location of the recognition sequence is 5' to
the mutant region and thus is incorporated at the termini of
any extension products. Since recognition sequences are
located at the ends of linear extension products, they can
also contain additional 5' sequences to facilitate recognition
and cleavage by a class IIS enzyme. For example, the primers
in Figure 2B (SEQ ID NO: 2; SEQ ID NO: 3) contain four
additional nucleotides 5' to the Bsa I recognition sequence
(SEQ ID NO: 5).

Other sequences included within the recognition sequence
component of EIPCR primers are the nucleotides between the
recognition sequence and the cleavage site. The number of
nucleotides will correspond to the distance between these two
sites and therefore will vary for different enzymes. For
example, the primers of Figure 2 contain a Bsa I recognition
sequence (SEQ ID NO: 5) which is cleaved by Bsa I on opposite
strands one and five nucleotides, respectively, 3' to the
recognition sequence, leaving a four nucleotide single-strand
overhang. Generally, such overhang sequences within the
primers are completely complementary to each other but can
include limited mutations. Primers are synthesized with
filler nucleotides placed 5' to the first cleavage site. The
number of filler nucleotides corresponds to the distance
between the particular class IIS recognition sequence used and
its cleavage site. The sequence of such spacer nucleotides
can, for example, correspond to wild-type or non-wild-type
sequences or to predetermined mutations. For generating just
a few point mutations, it is beneficial to match these
nucleotides to the wild-type sequence to increase the
hybridization stability of the adjacent mutant primer region.
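
The relationship between cleavage distance, filler nucleotides
and overhang length can be sketched in Python. This is an
illustration only and not part of the disclosed method. The
table holds just the two enzymes used in the examples, the
Bbs I offsets (2 and 6) are the commonly cited values rather
than values stated in this text, and the names CLASS_IIS and
primer_geometry are hypothetical.

```python
# Cut offsets are counted 3' of the recognition sequence:
# (recognition sequence, cut on the strand carrying the site, cut on the opposite strand).
CLASS_IIS = {
    "BsaI": ("GGTCTC", 1, 5),   # offsets as described in the text above
    "BbsI": ("GAAGAC", 2, 6),   # assumption: commonly cited offsets
}

def primer_geometry(enzyme):
    """Return (filler_nt, overhang_nt) for a primer carrying this enzyme's site."""
    _site, first_cut, second_cut = CLASS_IIS[enzyme]
    filler_nt = first_cut                  # nucleotides between the site and the first cut
    overhang_nt = second_cut - first_cut   # single-strand overhang left after cleavage
    return filler_nt, overhang_nt

for name in CLASS_IIS:
    print(name, primer_geometry(name))
```

For Bsa I this gives one filler nucleotide and a four nucleotide
overhang, matching the description of the Figure 2 primers.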

Types of restriction enzyme recognition sequences to be
used in the invention are those recognized by class IIS
enzymes. These enzymes recognize the DNA through a sequence
specific interaction and cleave it at a discrete distance
downstream from the recognition sequence. The ability to
cleave such sequences downstream provides a useful means to
remove heterogeneous ends and to produce complementary termini
for circularization while at the same time removing the
recognition sequence from the final product. Specific
examples of class IIS recognition sequences have been listed
previously and are also listed in Figure 3 along with their
nucleotide sequences and cleavage sites (SEQ ID NOS: 5 through
20). Although recognition sequences having complementary
cleavage sites associated with them are preferred, those which
have blunt ended cleavage sites can also be used in the
invention.

The third sequence component of EIPCR primers is the
region to be hybridized to the template DNA. This region must
be sufficient in length and sequence to allow specific
hybridization to the template. The hybridized portion of the
primers must also form a stable primer-template which can be
used as a substrate for polymerase extension. It is typically
found 3' to the mutant primer region and its sequence is
determined with respect to the location of the desired
mutations. For example, for the primers shown in Figure 2
(SEQ ID NO: 2; SEQ ID NO: 3), the hybridization region is
twenty nucleotides in length and found 3' to the mutant
region. However, the hybridization region can also be 5' to
the mutant region. For this orientation, the mutant region
must form a stable primer-template which can be used as a
substrate for polymerase extension. Longer or shorter
hybridization sequences can be used in this region so long as
they are appropriately located with respect to the mutant
region and also specifically hybridize to the template
molecule. One skilled in the art knows or can readily
determine the specificity of such hybridization regions for
use in EIPCR primers.

Thus, the invention also provides a synthetic primer for
introducing at least one predetermined change in a nucleic
acid sequence of a double-stranded circular DNA. The primer
includes: (a) a class IIS restriction enzyme recognition
sequence; (b) said predetermined change in said nucleic acid
sequence; and (c) a nucleic acid sequence substantially
complementary to said double-stranded DNA. The preferred
orientation of the above regions (a) through (c) is in a 5' to
3' direction.
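
The 5' to 3' ordering of the primer regions can be sketched as
follows. This is a hedged illustration, not the sequences of
Figure 2: only GGTCTC (Bsa I) and the mutant region GTTCC come
from the text, while the four-base pad, the 20-base
hybridization regions and the function names are hypothetical.
The sketch also assumes one possible way of making the two
overhangs complementary, namely letting the first base of the
mutant region in primer B serve as its filler nucleotide.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def build_primer(pad, site, filler, mutant_region, hybridization_region):
    """Join the primer regions in the preferred 5' to 3' order."""
    return pad + site + filler + mutant_region + hybridization_region

def overhang(primer, site="GGTCTC", first_cut=1, second_cut=5):
    """Bases of the primer exposed as a single-strand overhang after cleavage."""
    start = primer.index(site) + len(site)
    return primer[start + first_cut:start + second_cut]

# Hypothetical pad and hybridization sequences, for illustration only.
primer_a = build_primer("ACGT", "GGTCTC", "A", "GTTCC", "ATGACCATGATTACGCCAAG")
# Assumption: in primer B the first mutant base doubles as the filler nucleotide.
primer_b = build_primer("ACGT", "GGTCTC", "", revcomp("GTTCC"), "TGCAGGTCGACTCTAGAGGA")

# The two overhangs can anneal only if one is the reverse complement of the other.
compatible = overhang(primer_a) == revcomp(overhang(primer_b))
print(overhang(primer_a), overhang(primer_b), compatible)
```

Checking overhang compatibility at design time is a convenient
guard, since primers whose cleavage sites do not overlap
correctly cannot be recircularized by ligation.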

The above described primers can be, for example,
hybridized to a double-stranded circular or linear DNA
molecule which has first been denatured. Denaturation can be
performed, for example, using heat or an alkaline solution.
Other methods known to one skilled in the art can also be
used.

Hybridization of the primers occurs on opposite strands
of the circular template and in a location where the single-
stranded overhangs of each primer's complementary cleavage
site can be joined together by restriction and ligation.
Preferably, such joining should occur so that the wild-type
sequence is reformed except for the incorporation of the
desired mutations. One way to ensure proper sequence
reconstruction is to design the primers such that their
complementary cleavage sites overlap and are either identical
to the template sequence or contain some or all of the desired
mutations. Such primers, once hybridized to a double-stranded
circular DNA, form primer-templates and can be extended with
a polymerase. The first extension reactions of circular
templates result in the synthesis of double-stranded circular
products which can be concatenated. Depending on the extent
of polymerization, the concatemers can be either partially or
completely double-stranded. It is necessary for
polymerization to proceed sufficiently far to allow subsequent
primer hybridization for a second extension reaction. Smaller
circular DNAs result in a greater number of completely double-
stranded products and also require shorter extension times
compared to much larger circles. Small circular DNAs of less
than 1.0 kb are known in the art. Such vectors are beneficial
to use in the invention since they can accommodate large
inserts (3 to 5 kb) and still be comparable in size to most
standard cloning vectors. The plasmid pVX is a specific
example of a 902 bp vector, Seed, B., Nuc. Acids Res. 11:2477-
2444 (198?), which is incorporated herein by reference. Such
vectors can be further modified by the addition of, for
example, promoters, terminators and the like to achieve the
desired end. Complete extension of a circular DNA of about
5.0 kb can be achieved using the conditions described herein;
however, alternative conditions used by those skilled in the
art to achieve complete extension of larger circular DNAs can
also be used to practice the invention. For linear templates,
on the other hand, the first extension reaction produces a
double-stranded linear molecule known in the art as the long
product.

After one extension reaction, the double-stranded
products, whether they exist as circular or linear molecules,
have incorporated at one of their ends the EIPCR primer with
its associated class IIS restriction enzyme recognition
sequence and the desired mutations. These double-stranded
molecules can be used for a second cycle of hybridization and
extension to produce double-stranded linear molecules which
terminate at both ends with EIPCR primers. Further cycles
will result in the exponential amplification of template
sequence located between each primer on the circular DNA.
Thus, the location of the hybridized primers defines the
termini of template sequences to be amplified.

Polymerases which can be used for the extension reaction
include all of the known DNA polymerases. However, if
multiple cycles of hybridization and extension are to be
performed, such as required for PCR amplification, then
preferably a thermostable polymerase is used. Thermostable
polymerases include, for example, Taq polymerase, Vent
polymerase and Pfu polymerase. Vent and Pfu polymerases
advantageously exhibit a higher fidelity than Taq due to their
3' to 5' proofreading capability.

Following synthesis of the linear molecules, the products
are restricted with the appropriate class IIS restriction
enzyme to remove the class IIS recognition sequence and
heterogeneous termini and to create cohesive termini used for
circularization. The resulting termini correspond to the
single-strand overhangs produced after restriction of each
primer's complementary cleavage site. To facilitate proper
recognition and cleavage, the linear products can be pre-
treated with a polymerase, such as Klenow, under conditions
which create blunt ends. This procedure will fill in any
uncompleted product ends produced during amplification and
allows efficient restriction of essentially all of the
products. After restriction, the cohesive termini can be
joined to recircularize the linear molecule. Covalently
closed circles can subsequently be formed in vitro with a
ligase. Alternatively, in vivo ligation can be accomplished
by introducing the circularized products into a compatible
host by transformation or electroporation, for example.
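
The restriction and recircularization steps described above can
be modelled on a sequence string. The following Python sketch
is illustrative only: it assumes a blunt-ended linear product
with one inward-facing Bsa I site near each end, treats only
the top strand, and uses hypothetical sequences and a
hypothetical function name.

```python
SITE = "GGTCTC"      # Bsa I recognition sequence (left end, top strand)
SITE_RC = "GAGACC"   # the same site at the right end, as read on the top strand

def restrict_and_circularize(linear_top):
    """Trim both Bsa I ends and return the top strand of the recircularized product."""
    left = linear_top.index(SITE) + len(SITE)   # first base 3' of the left site
    right = linear_top.rindex(SITE_RC)          # first base of the right site
    # Bsa I cuts 1 and 5 nucleotides from its site, exposing four bases at each end.
    left_overhang = linear_top[left + 1:left + 5]
    right_overhang = linear_top[right - 5:right - 1]   # top-strand view of the right end
    if left_overhang != right_overhang:
        raise ValueError("cohesive termini are not complementary")
    # The recognition sequences and everything outside them are discarded.
    return linear_top[left + 1:right - 5]

body = "CATGAAACTGCAGTTACG"   # hypothetical amplified template sequence
linear = "ACGT" + SITE + "A" + "GTTC" + body + "GTTC" + "T" + SITE_RC + "ACGT"
circle = restrict_and_circularize(linear)
print(circle)
```

The returned string is read as circular: its last base is joined
to its first, and neither the Bsa I sites nor the extra 5' bases
remain in the product.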

Transformation or electroporation of the circularized
products can additionally be used for the propagation and
manipulation of mutant products. Such techniques and their
uses are known to one skilled in the art and are described,
for example, in Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, Cold Spring Harbor, NY
(1989), or in Ausubel et al., Current Protocols in Molecular
Biology, John Wiley and Sons, New York, NY (1989), both of
which are incorporated herein by reference. Propagation and
manipulation procedures do not have to be performed at the end
of all EIPCR reactions. The need will determine whether such
procedures are necessary. For example, transformation and DNA
preparation can be eliminated if two consecutive EIPCR
reactions are to be performed where the product of the first
reaction is used as the template for the second reaction. All
that is necessary is that the first reaction products are
circularized and ligated prior to hybridization with the
second reaction primers. Additionally, primers for EIPCR can
be used without purification. EIPCR is not as sensitive as
other methods to the presence of primers of incomplete length
because the non-uniform DNA ends are removed by restriction of
the class IIS recognition sequence.

The invention further provides methods of producing at
least two changes located at one or more positions within a
nucleic acid sequence of a double-stranded circular DNA. The
methods include: (a) providing a first population of primers
and a second population of primers capable of directing said
changes in said nucleic acid sequence, said first and second
populations of primers comprising a nucleic acid sequence
substantially complementary to said double-stranded DNA so as
to allow hybridization, a class IIS restriction enzyme
recognition sequence, and cleavage sites; (b) hybridizing said
first and second populations of primers to opposite strands of
said double-stranded DNA to form a first pair of primer-
template populations oriented in opposite directions; (c)
extending said first pair of primer-template populations to
create a population of double-stranded molecules; (d)
hybridizing said first and second populations of primers at
least once to said population of double-stranded molecules to
form a second pair of primer-template populations; (e)
extending said second pair of primer-template populations to
produce a population of double-stranded linear molecules
terminating with class IIS restriction enzyme recognition
sequences; and (f) restricting said population of double-
stranded linear molecules with a class IIS restriction enzyme
to form a population of restricted linear molecules containing
said changes within said nucleic acid sequence. Also provided
is a population of synthetic primers for producing at least
two changes located at one or more positions within a nucleic
acid sequence of a double-stranded circular DNA comprising:
(a) a class IIS restriction enzyme recognition sequence; (b)
said changes within said nucleic acid sequence; and (c) a
nucleic acid sequence substantially complementary to said
double-stranded circular DNA.

The method for producing at least two changes located at
one or more positions is similar to that described above for
site-directed mutagenesis except that the primers can have
more than one nucleotide at a desired position. For example,
if it is desirable to produce mutations incorporating from two
to four different mutant nucleotides at a particular position,
then a population of primers should be synthesized such that
all mutant nucleotides are represented within the entire
population. Each individual primer within the population will
contain only a single mutant nucleotide. The proportion of
primers containing identical mutant nucleotides will determine
the expected frequency of that mutation being correctly
incorporated into the final product. For example, if only two
mutant nucleotides are desired and each one is equally
represented within the primer population, then 50% of the
products should contain one of the mutations and 50% should
contain the other mutation. If more than two mutations are
desired at a particular position or at more than one position,
then primer populations should be synthesized which contain
individual primers having each of the desired mutations.
Primer populations can also be synthesized which direct single
mutations at one position and multiple mutations at another
position by incorporating one or more mutant nucleotides at
the appropriate position.
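
The composition of such a primer population, and the expected
frequency of each mutation, can be enumerated as in the Python
sketch below. It is a simplified illustration that assumes
every allowed nucleotide at a position is equally represented.
The function name and the choice of varied positions are
hypothetical; GTTCC is the mutant region named earlier.

```python
from itertools import product

def primer_population(region, choices):
    """Enumerate the members of a primer population.

    region  -- the primer region being varied
    choices -- maps a 0-based position to a string of allowed nucleotides
    Returns {sequence: expected fraction of the population}.
    """
    positions = sorted(choices)
    combos = list(product(*(choices[p] for p in positions)))
    population = {}
    for combo in combos:
        bases = list(region)
        for pos, base in zip(positions, combo):
            bases[pos] = base
        # Equal representation is an assumption of this sketch.
        population["".join(bases)] = 1 / len(combos)
    return population

two_mutants = primer_population("GTTCC", {2: "AG"})
print(two_mutants)
```

With two equally represented nucleotides at one position, each
primer makes up half of the population, which corresponds to
the 50%/50% product distribution described above.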

The design and use of such primers is identical to that
previously described for introducing at least one
predetermined change into a double-stranded circular DNA. The
only difference is that instead of hybridizing a first primer
and a second primer to form a pair of primer-templates,
hybridization is with a first population of primers and a
second population of primers to form a pair of primer-template
populations. Each primer-template within the population can
include, for example, one of the desired mutant sequences to
be incorporated into the resultant products. Amplification of
the primer-template population will produce a population of
linear products containing all desired mutations. The
products can be restricted, circularized and screened for
individual mutant clones. Screening can be performed, for
example, by sequencing or by expression of polypeptide.
Selection can be performed by linking polypeptide expression
with the expression of a suitable marker such as an antibiotic
resistance gene, luciferase, or the like. Only colonies
containing the gene are selected. Following selection,
positive colonies can then be screened for a particular
characteristic. Expression screening or selection offers the
advantage of screening or selecting a large number of clones
in a relatively short period of time. These assays permit the
identification of clones of interest. Examples of screening
and selection assays are well known to those with skill in the
art. Each assay is designed and modified for that particular
application. Examples of these assays are found in the
examples below.

The methods and primers described herein can be used to
create essentially any desired change in a nucleic acid
sequence. Templates can be linear or circular and result in
products containing only the desired changes since class IIS
recognition sequences allow the removal of extraneous and
unwanted sequences. Product termini which are homogeneous in
nature are also produced using the class IIS recognition
sequences. Use of circular templates allows the incorporation
of mutations at any desired location along the template with
subsequent recircularization of the mutant products. Thus,
additions, deletions and substitutions of single base pairs,
multiple base pairs, gene segments and whole genes can rapidly
and efficiently be produced using EIPCR. A specific use of
EIPCR would be in the mutagenesis of antibodies or antibody
domains. Mutagenesis of antibody complementarity determining
regions (CDR), for example, can be performed using EIPCR for
the rapid generation of antibodies exhibiting altered binding
specificities. Likewise, EIPCR can also be used for producing
chimeric and/or humanized antibodies having desired
immunogenic properties.

The efficiency of incorporating correct mutations into
the product using EIPCR can be, for example, greater than
about 90%, preferably about 95 to 99%, more preferably about
100%. This efficiency is routinely obtained when using about
0.5 to 2.0 ng of template in a 25 cycle PCR reaction.
However, it should be understood that the efficiency directly
correlates with the number of amplification cycles and
inversely with the amount of template used. For example, the
more amplification cycles which are performed, the greater the
amount of mutant product present and therefore a larger
fraction of mutant sequences will be present within the total
sequence population. Conversely, if a large amount of
template is used, more amplification cycles are required,
compared to using a smaller amount of template, to achieve the
same fraction of mutant sequences within the total sequence
population. One skilled in the art knows such parameters and
can adjust the number of cycles and amount of template
required to achieve the required efficiency.

The following examples are intended to illustrate but not
limit the invention.

EXAMPLE I

This example shows the use of EIPCR for site-directed
mutagenesis of two bases located on a 2.6 kb pUC-based plasmid
(designated p186).

The design of the primers and their relationship to the
template and to the final mutant sequence is shown in Figure
2. The 3' end of the primer is an exact match of 20 bases.
The 5' ends of the primers comprise the enzyme recognition
site and the enzyme cut site, which was designed to form
complementary overhangs. Four additional bases were added 5'
to the enzyme recognition sequence to facilitate recognition
and digestion of the PCR product by the enzyme. Two
complementary mutations were designed into each of the
primers. Bsa I was the enzyme used to make the overhangs
(Figure 3).

PCR reactions were performed in 100 ul volumes containing
0.2-1.0 uM of each unpurified primer, 0.5 ng uncut p186
template plasmid DNA, 1x Vent buffer, 200 uM of each dNTP, and
2.5 units Vent polymerase (New England Biolabs, Beverly, MA).
Thermal cycling was performed on a Perkin-Elmer-Cetus PCR
machine (Emeryville, CA) with the following parameters:
94°C/3 minutes for 1 cycle; 94°C/1 minute, 50°C/1 minute,
72°C/3-4 minutes for 3 cycles; 94°C/1 minute, 55°C/1 minute,
72°C/3-4 minutes, with autoextension at 4-6 sec/cycle for 25
cycles; followed by one 10 minute cycle at 72°C.

To blunt the ends of the PCR product, the entire reaction
mix was supplemented with 8 ul of 10 mM of dNTP mixture (2.5
mM each) and 20 units of Klenow fragment (Gibco-BRL,
Gaithersburg, MD) and incubated at 37°C for 30 minutes. The
reaction was then extracted with an equal volume of
phenol/chloroform (1:1), ethanol-precipitated, and the pellet
was washed and dried. The blunt end product was then
restriction digested with Bsa I (New England Biolabs, Beverly,
MA) as recommended by the manufacturer. The digested DNA was
extracted with an equal volume of phenol/chloroform, ethanol-
precipitated, as described above, and ligated with 20 units T4
DNA ligase (Gibco-BRL) for one hour at room temperature. Gel-
purification of the digested DNA before ligation was not
necessary. After ligation, the DNA was transformed into
competent DH10B cells as recommended by the manufacturer
(Gibco-BRL).

Approximately 400 colonies were obtained from a
transformation using 10 ng of DNA into 30 ul of frozen
competent cells. The transformation efficiency was 4x10^4
cfu/ug of DNA. Seven colonies were randomly picked and
plasmid DNA was prepared for restriction digests. No
differences in restriction pattern were seen. The mutated
areas of the plasmids of these seven colonies were sequenced.
Double-stranded dideoxy sequencing was performed on a Dupont
Genesis 2000 automated sequencer using the Dupont Genesis 2000
sequencing kit. The sequences of all seven plasmids contained
the desired mutation.

EXAMPLE II 

This example shows the use of EIPCR for constructing
large libraries of protein mutants.

The binding site of an antibody, called the Fv fragment,
normally consists of a heavy chain and a light chain, each
about 110 amino acids long. Using molecular modelling tools,
several groups have constructed single chain Fv fragments
(scFv) in which the c-terminus of one chain is connected by a
10-15 amino acid linker to the n-terminus of the other chain
(Huston, Bird, Glockshuber). The single chain construct was
shown to be much more stable than the two chain Fv.

To eliminate the need for molecular modelling, EIPCR was
used to make a large library of different linkers and screen
for a scFv clone that is not only active but also expressed at
a high level. An antibody was chosen that binds a radioactive
indium chelate, Reardan et al., Nature 316:265-267 (1985),
which is incorporated herein by reference. A 3.5 kb pUC-
derived plasmid was constructed in which both Fv chains are
attached to ompA leader peptides and driven by a Lac promoter



WO 93/12257 



PCT/US92/10647 




(Figure 4). This plasmid was used as the template for EIPCR
in which the DNA between the c-terminus of the first chain and
the n-terminus of the mature second chain was replaced by a
random mixture of bases, encoding a library of random linkers.
The design of the primers is shown in Figure 4B in the shaded
region where N represents an equal proportion of all four
nucleotides at the position within the primer population.
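
The number of distinct sequences that such N positions can
encode, and the chance that a given sequence is sampled in a
library of a given size, can be estimated as sketched below.
This is an illustrative calculation under stated assumptions:
clones are treated as independent, uniformly distributed draws,
the number of mixed positions is a hypothetical parameter, and
the function names are not from the source.

```python
import math

def nucleotide_diversity(n_mixed_positions):
    """Distinct sequences when every mixed position is N (A, C, G or T)."""
    return 4 ** n_mixed_positions

def chance_variant_present(library_size, diversity):
    """Probability that one particular variant occurs at least once.

    Assumes independent, uniform sampling and diversity greater than 1.
    Equivalent to 1 - (1 - 1/diversity) ** library_size, written with
    log1p/expm1 so that very large diversities do not round to zero.
    """
    return -math.expm1(library_size * math.log1p(-1.0 / diversity))

# Hypothetical example: six fully mixed positions, 30,000 clones screened.
diversity = nucleotide_diversity(6)
print(diversity, chance_variant_present(30_000, diversity))
```

Because diversity grows fourfold with each added N position, a
library of fixed size samples a rapidly shrinking fraction of
the possible sequences as the mixed region lengthens.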

Synthesis of the two primer populations used to construct
the library was performed on a Milligen/Biosearch 8700 DNA
synthesizer. The mixed base positions were synthesized using
a 1:1:1:1 mixture of each of the four bases in the D
reservoir. The oligonucleotides were made trityl-on and were
purified with Nensorb Prep nucleic acid purification columns
(NEN-Dupont, Boston, MA) as described by the manufacturer.

PCR reactions were performed in 100 ul volumes containing
0.5 uM of each unpurified primer, 0.5 ng pUCHAFvl template
plasmid DNA, 1x Taq buffer, 200 uM of each dNTP, 1 ul Taq
polymerase (Perkin-Elmer-Cetus). Thermal cycling was
performed on a Perkin-Elmer-Cetus PCR machine with the
following parameters: 94°C/3 minutes for 1 cycle; 94°C/1
minute, 50°C/1 minute, 72°C/2 minutes for 3 cycles; 94°C/1
minute, 55°C/1 minute, 72°C/2 minutes, with autoextension at
4 sec/cycle for 25 cycles; followed by one 10 minute cycle at
72°C.
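
The cycling parameters above can be written as a small table to
estimate the programmed hold time. The Python sketch below is
an approximation only: it ignores temperature ramp times, and
it assumes that autoextension lengthens the extension step by
4 seconds with each successive cycle, starting with no increment
in the first cycle. The names PROGRAM and total_hold_seconds
are hypothetical.

```python
# Each stage: (number of cycles, hold times in seconds, autoextension in sec/cycle)
PROGRAM = [
    (1,  [180],         0),   # 94°C/3 minutes
    (3,  [60, 60, 120], 0),   # 94°C/1 min, 50°C/1 min, 72°C/2 min
    (25, [60, 60, 120], 4),   # 94°C/1 min, 55°C/1 min, 72°C/2 min, autoextension
    (1,  [600],         0),   # final 72°C/10 minutes
]

def total_hold_seconds(program):
    """Sum all programmed hold times, excluding ramping between temperatures."""
    total = 0
    for cycles, holds, autoextension in program:
        for cycle_index in range(cycles):
            # Assumed autoextension model: +autoextension seconds per elapsed cycle.
            total += sum(holds) + autoextension * cycle_index
    return total

seconds = total_hold_seconds(PROGRAM)
print(seconds, "seconds, or", seconds / 60, "minutes of hold time")
```

Under these assumptions the program holds for 8700 seconds, or
145 minutes, before ramp times are added.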

The product of the 100 ul PCR was extracted with an equal
volume of phenol/chloroform (1:1), ethanol-precipitated, and
the pellet was resuspended in 20 ul KKL buffer (50 mM Tris-HCl
pH 7.6, 10 mM MgCl2, 5 mM DTT; suitable for Klenow, Kinase and
Ligase) containing 200 dNTPs, 1 mM ATP, 10 units DNA
Polymerase Klenow fragment and 10 units T4 DNA Kinase and
incubated at 37°C for 30 minutes. Then 10 units T4 DNA ligase
were added, and the reaction was continued for 2 hours at room
temperature. The enzymes were then inactivated by heating at
65°C for 10 minutes. The polymerized DNA was then digested
with Bbs I (NEB) which cuts off the ends of the PCR fragment,
inside the oligos. It was found that Bbs I digestion was
inefficient with only four bp 5' to the recognition sequence.
To create a longer 5' extension and improve efficiency, the
DNA was ligated before digestion. Alternatively, primers
could have been synthesized with longer 5' extensions. The
digested DNA was then extracted with phenol/chloroform,
ethanol precipitated, and resuspended in 20 ul 1x NEB ligation
buffer, containing 1 mM ATP and 10 units T4 DNA ligase and the
reaction was incubated for 2 hours at room temperature.

One microliter amounts of the ligation reaction were
electroporated into 20 ul of DH10B Electromax cells (Gibco-
BRL, Gaithersburg, MD) to produce a library of scFv
constructs. The Gibco-BRL electroporator and voltage booster
were used as recommended by the manufacturer. Cells were
plated at 3,000 cfu/plate on plates containing 0.05 mM IPTG,
to induce Fv expression.

For screening, the labelled chelate was prepared by
incubating 10 ul of 0.075 mM Eotube chelate with 50 uCi of
buffered 111-indium chloride in a metal free tube. Colony lifts
of the petri plates containing the protein library were
prepared using BA83 nitrocellulose filters (Schleicher and
Schuell, Keene, NH). The filters were blocked by incubation
in Blotto (7% non-fat milk in PBS) for 10 minutes, washed with
PBS, followed by incubation in Blotto containing 10 uCi of
111-indium chloride per filter for 1 hour at room temperature.
The filters were then washed repeatedly with PBS for a total
of 15 minutes, dried and exposed to Kodak X-omat AR
autoradiography film for several hours.

The quality of the protein library was determined by DNA
sequencing of the linker of several unscreened clones.
Sequencing was performed as described in Example I. The
composition of the mixed site residues was 19% G, 31% A, 25%
T, 25% C (n=119).

The size of the library was determined by plating. In a
typical electroporation, 30,000 cfu's were obtained from
electroporation of 1 ul of ligation mixture into 20 ul of
cells. The ligation contained 0.1 ug of DNA in 20 ul. The
library size was about 3x10^5 recombinants and the
electroporation efficiency was 6x10^6 cfu/ug. Approximately
30,000 clones were screened, and about 60 colonies gave a
range of signals on the primary screen (0.2%). Those with the
strongest signal were colony purified and the DNA sequence of
the linker was determined. The sequence of one linker from
an identified scFv clone is shown in Figure 4C.
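
The efficiency figures quoted in the examples follow from
simple arithmetic, sketched here in Python as a check. The
input numbers are taken from Examples I and II; the function
and variable names are hypothetical.

```python
def cfu_per_ug(colonies, dna_ug):
    """Transformation efficiency: colony forming units per microgram of DNA."""
    return colonies / dna_ug

# Example I: about 400 colonies from 10 ng (0.010 ug) of DNA.
example_1 = cfu_per_ug(400, 0.010)

# Example II: 30,000 cfu from 1 ul of a 20 ul ligation containing 0.1 ug of DNA.
dna_ug_per_ul = 0.1 / 20
example_2 = cfu_per_ug(30_000, 1 * dna_ug_per_ul)

# Primary screen: about 60 positive colonies among about 30,000 clones.
hit_rate_percent = 100 * 60 / 30_000

print(f"{example_1:.0e} cfu/ug, {example_2:.0e} cfu/ug, {hit_rate_percent:.1f}%")
```

These values correspond to efficiencies of about 4x10^4 and
6x10^6 cfu/ug and to a primary hit rate of 0.2%.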

LIBRARY MUTAGENESIS

Library mutagenesis using a heterogeneous primer
population permits incorporation of a large number of
mutations into a population of host cells to generate a
recombinant library. The resulting mutations are typically
introduced into a polynucleotide suitable for cell delivery.
The polynucleotide can additionally be adapted for expression.
These polynucleotides may contain changes in either the
regulatory region of the polynucleotide or in a translatable
region. The directed mutations in the polynucleotide sequence
may alter levels of protein expression, alter a functional
characteristic of a protein, or confer a particular cell
phenotype. The incorporation of a large number of mutations
into a host population is termed library mutagenesis. In
general, libraries can be prepared and screened for changes in
any measurable cell property. Similarly, the transformed or
transfected cells containing the altered nucleic acid
sequences can be screened or selected for a desired
polynucleotide sequence independent of polypeptide expression.

There are several different methods for performing
library mutagenesis that are available to those of skill in
the art. A number of these methods use PCR to produce a
library of mutant constructs. However, none of the existing
methods for making mutant libraries are based on inverse PCR.

Enzymatic Inverse PCR (EIPCR) amplifies the entire
plasmid, a portion of the plasmid or linear sequence of a
polynucleotide. These methods differ from other mutagenesis
methods in the use of class IIS restriction sequences in the
5' end of both primers. Digestion with class IIS restriction
enzymes, such as Bsa I (GGTCTCN'NNNN), which have their
recognition sequence 5' to and separated from their cleavage
site, causes the removal of the entire recognition region
prior to ligation. This preferably leaves the product with
compatible overhangs at each end. Intramolecular ligation of
the PCR product yields a full-length circular plasmid.

An important advantage of EIPCR library mutagenesis is
that any plasmid or DNA fragment can be used to create a
library of mutations. The only limitation is the efficiency
of the PCR process. The generation of a complementary strand
is limited by the length of the template and by the elongation
rate of the polymerase. It is likely that advances in PCR
technology, in particular, enzyme efficiency, will permit long
DNA fragments to be used in this invention. The library
mutagenesis methods disclosed herein are rapid and efficient
and permit one of skill in the art to generate several
libraries in a day. For example, once primers are prepared,
libraries such as those prepared in Example III can be
generated in 6 to 10 hours.

In EIPCR library mutagenesis, the template is
amplified using mutagenic primers. The simple design of EIPCR
results in a high efficiency of ligation of mutant plasmids,
generating a high level of diversity in the library. The
higher the level of genetic diversity in a recombinant
library, the more likely the library will contain a mutant of
interest readily identifiable by methods known to one of skill
in the art. Another important benefit of EIPCR over other
methods for library mutagenesis is that, as in EIPCR site-
directed mutagenesis, mutations can be made in any area of the
sequence independent of available restriction sequences.
Restriction endonuclease recognition sites are not
incorporated into the final construct. The usefulness of EIPCR
for library mutagenesis is described in Example III and
illustrated in Figure 5.

A method for performing library mutagenesis to generate
a recombinant library by introducing changes within a
predetermined region of linear or, preferably, circular double
stranded DNA is contemplated herein. The method comprises (a)
providing a first primer population and a second primer
population, each having at least one variable base at known
complementary positions along the primers capable of directing
a change in the nucleic acid sequence, the first and second
primer populations being substantially complementary to the
double-stranded nucleic acid to allow hybridization thereto
and having a class IIS restriction enzyme recognition sequence
and cleavage sites, (b) hybridizing the first and second
primer populations to opposite strands of the double stranded
nucleic acid to form a first pair of primer-templates oriented
in opposite directions, (c) performing enzymatic inverse PCR
as hereinbefore described, (d) cutting the double stranded
linear molecules with a class IIS restriction enzyme to form
restricted linear polynucleotide sequences containing the
change in said nucleic acid sequence, thereby removing
restriction endonuclease recognition sites, (e) optionally
joining termini of the restricted linear molecules of step (d)
to produce a double-stranded circular polynucleotide sequence,
and (f) introducing the polynucleotide sequence obtained from
step (d) or (e) into compatible host cells.

The term "primer population" is used to describe the pool
of primers that have identical base compositions except at
certain predetermined locations along the sequence that
contain a variable composition. The primers for EIPCR library
mutagenesis are otherwise designed similar to those primers
used for EIPCR site-directed mutagenesis. Primer pairs for
EIPCR mutagenesis are designed to hybridize to the top and
bottom strands of a double stranded template and to extend in
opposite directions. The primers are chosen to be
substantially complementary to that region of the nucleic acid
template to be mutagenized. These primers may be overlapping
on the template, contiguous, or non-overlapping. The
primer pairs are substantially complementary to the template
to facilitate hybridization during the PCR process.

Preferably, the primer contains at least a 15 base region at 
the 3 • end of the primer that is complementary to the 
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template. Other regi ns f complementarity may be 
interspersed throughout the length of the primer. The prim r 
additionally contains a class IIS restriction endonuclease 
recognition sequence and a region containing noncomplementary 
5 bases that confers the desired variable mutation. The 
variable region can be of any length, the only restriction on 
length being the ability of the primers to hybridize to the 
template and direct synthesis of a substantially complementary 
strand of DNA. Further, the variable region or regions may be 

10 interspersed between complementary regions along the primer 
strand. Filler base regions can additionally be added to the 
primer at the 5 » end of the primer, before the class IIS 
recognition sequence, and between the class IIS recognition 
sequence and the class IIS cleavage site. Any final primer 

15 length is contemplated within the scope of the invention. 
Primer length is limited only by the efficiency of the 
oligonucleotide synthesizer. Primers may be prepared by 
methods known to those of skill in the art. Those with skill 
in the art will be readily able to determine if a given primer 

20 adequately hybridizes to a given template and is thus suitable 
for amplification using EIPCR. 

The extent of primer variability desirable for library
mutagenesis is determined during primer synthesis. A mixture
of nucleotides, or polynucleotides such as amino acid encoding
trimers, is introduced at one or more positions along the
primer oligonucleotide. The addition of trinucleotide
fragments during synthesis provides direct control over amino
acid mixtures. The nucleotide mixture is formulated to
contain a predetermined percentage of each of the four bases.
These percentages may vary from 0% to less than 100% for any
one base and from 0 to 100% for each of the 64 amino acid
encoding trimers. The frequency of a given sequence is
determined by the desired probability that a particular base
or trimer will be present at a particular position along the
primer. Thus, for example, if the library is to contain
variable mutations at position 6 of the primer oligonucleotide
corresponding to a 75% average likelihood that position 6 is
guanosine and a 25% average likelihood that position 6 will be
adenosine, then the elongating primer will be exposed to a
mixture of 3/4 guanosine and 1/4 adenosine at position 6.
These mixtures can also be prepared in proportions such that
for a region of 10 bases it is likely that on average only one
of the 10 bases in any primer is different from the template
sequence. This provides a primer pool that theoretically
represents every possible permutation in each nucleotide
position over a 10 base pair sequence. A review of primer
preparation and design in random mutagenesis can be found in
Oligonucleotides and Analogues: A Practical Approach (F.
Eckstein Ed., Oxford University Press, 1991) and Hermes et
al., Gene 84:143-151, 1989, which is hereby incorporated by
reference.
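The 10 base example above can be checked with a short calculation. The following Python sketch is illustrative only: it assumes each doped position is mutated independently with the same probability, and it takes 10% non-template bases per position as the mixture that gives one mismatch per primer on average. The function names are hypothetical.

```python
from math import comb


def mutation_count_distribution(region_length, per_base_mutation_rate):
    """Return [P(0 mismatches), P(1 mismatch), ...] for a doped region.

    Assumes every position is doped independently at the same rate
    (a simplifying assumption, not a statement about any synthesizer).
    """
    n = region_length
    p = per_base_mutation_rate
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def expected_mutations(region_length, per_base_mutation_rate):
    """Mean number of mismatches per primer under the same assumption."""
    return region_length * per_base_mutation_rate


if __name__ == "__main__":
    # Assumed doping: 90% template base, 10% non-template, over 10 positions.
    dist = mutation_count_distribution(10, 0.10)
    print("mean mismatches per primer:", expected_mutations(10, 0.10))
    for k, prob in enumerate(dist[:4]):
        print(f"P({k} mismatches) = {prob:.3f}")
```

Under these assumptions a sizeable share of the pool still carries no mismatch or more than one, so "one change per primer" describes the average rather than every primer.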

As illustrated in Figure 6, the primer pairs contain a
complementary region at the class IIS restriction endonuclease
cleavage site. In EIPCR library mutagenesis, this overlapping
region preferably does not contain a mutation. This ensures
that recircularization of the template can occur following PCR
amplification. In the examples that follow, class IIS
restriction endonuclease BsaI is used to generate a four base
overhang at each end of the nucleotide sequence. Figure 3
provides an exemplary list of other class IIS restriction
endonucleases contemplated within the scope of this
invention.

Library mutagenesis can be used to alter any region
within a nucleic acid sequence. These mutagenesis procedures
are particularly useful for generating a library of mutations
within the mature region of a protein sequence, within a
leader sequence, or within sequences that do not encode
protein. Sequences that do not encode protein may influence
or regulate protein expression. These include, but are not
limited to, non-coding regions on the DNA, for example,
enhancer sequences, promoter regions, sites for DNA binding
proteins such as repressors, Z-DNA formation, matrix
associated regions, telomeres, origins of replication and
recombination signals. In addition to those non-coding
regions on the DNA that are transcribed, non-coding regions on
mRNA additionally contemplated include, but are not limited to,
snRNPs, spliceosomes, ribosome binding sites, regions of
secondary structure, terminators, stability sites and cap
sites. It is additionally contemplated within the scope of
this invention that EIPCR library mutagenesis can be used to
generate recombinant libraries containing altered sequences
corresponding to tRNA or rRNA. Mutations in regulatory regions
of a nucleic acid sequence can affect the level of protein
expression, while in-frame substitution mutations within the
nucleic acid sequence encoding protein can affect protein
function. It is therefore contemplated that the procedures
described herein will be useful for generating recombinant
libraries having mutations in any of these aforementioned
regions of the nucleic acid.

EIPCR library mutagenesis can be used to alter the
functional characteristics of a particular protein. A protein
sequence engineered into an expression construct can be used
as a nucleic acid template for EIPCR library mutagenesis.
Like other forms of library mutagenesis, this procedure can be
used, for example, to mutagenize a binding region on a
polypeptide, thereby generating an expression library that can
be screened or selected for altered binding characteristics.
EIPCR mutagenesis can also be employed to mutate a region of
a polypeptide sequence that influences intra-molecular
binding. For example, a polypeptide region that links two
protein domains involved in ligand binding can be mutated,
using the methods disclosed herein, to optimize the
interactions between the protein domains.
One type of mutagenesis contemplated within the scope of
this invention is wobble base library mutagenesis using EIPCR.
Wobble base mutagenesis incorporates mutations within the
primer population in positions that correspond to the third
position of a nucleotide codon. Most mutations in the third
position of a codon do not alter the amino acid sequence of
the resulting polypeptide. Accurate tRNA-mRNA pairing is
required at the first two positions within the codon during
translation. The third position can tolerate pairing with
more than one tRNA and this degeneracy is termed a "wobble".
Thus the same amino acid sequence can be derived from several
different nucleotide sequences.
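The degeneracy described above can be illustrated by enumerating the third-position substitutions that leave the encoded amino acids unchanged. The Python sketch below is an illustration under stated assumptions: it uses the standard genetic code, the function names are hypothetical, and the nine base test fragment is chosen only as an example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def silent_third_base_choices(codon):
    """Third-position bases that keep the codon's amino acid unchanged."""
    amino_acid = CODON_TABLE[codon]
    return [b for b in "ACGT" if CODON_TABLE[codon[:2] + b] == amino_acid]


def wobble_variants(dna):
    """All sequences differing from `dna` only by silent third-base changes."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    per_codon = [
        [codon[:2] + b for b in silent_third_base_choices(codon)]
        for codon in codons
    ]
    return ["".join(combo) for combo in product(*per_codon)]


if __name__ == "__main__":
    fragment = "ATGAAAAAG"  # illustrative fragment encoding Met-Lys-Lys
    for variant in wobble_variants(fragment):
        print(variant, translate(variant))
```

Codons such as ATG allow no silent third-base change, while four-fold degenerate codons such as GCN allow any base, which is why the number of variants depends on the sequence being mutagenized.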
Alterations in the nucleotide sequence that do not affect
the protein sequence may alter the level of protein synthesis
or expression within a given host. In particular, alterations
in the nucleic acid sequence of the leader portion of a
polypeptide can influence levels of protein synthesis from one
protein to another or from one host to another. An example of
two primers designed to confer alterations in the OmpA leader
sequence that result in increased levels of antibody Fv
fragment expression from E. coli is found in Figure 6. Once
a leader sequence is optimized for the expression of one
particular polypeptide, using EIPCR library mutagenesis,
within a given host, it is further contemplated that this
leader sequence can then be linked to other gene sequences
encoding polypeptides to optimize expression of those
polypeptides. Similarly, it is also contemplated within the
scope of this invention that other regulatory regions can be
optimized using EIPCR library mutagenesis and that these
optimized regions can be engineered into other expression
constructs for maximal expression of other polypeptides in
vitro or in vivo.

The invention is preferably designed to incorporate one
or more random changes within predetermined regions of a
circular template, such as a vector. Vector choice is
determined first by the choice of host cell used to create the
desired library. It is well known to those of skill in the
art that vectors are commercially available for protein
expression in prokaryotic and eukaryotic systems. Expression
vectors are available for bacteria, yeast and mammalian
systems. In addition, viral vectors for both eukaryotic and
prokaryotic cells are also contemplated within the scope of
this invention. Expression vectors are required when the
translation products from the mutated nucleic acid sequences
are to be assayed. An analysis of random mutations in nucleic
acid may not require the use of an expression vector where
mutations can be screened using polynucleotide probes or the
like. Those with skill in the art will be able to choose an
appropriate commercially available vector, create their own
vector, or recreate the exemplary vector described in Example
V below.

It is additionally contemplated within the scope of this 
invention that EIPCR library mutagenesis could be performed on 
one region of nucleic acid within a construct, and a second 
(and/or subsequent) mutagenesis procedure be performed on 
another region of a construct or on a separate nucleic acid 
construct. Following amplification, these sequences can then 
be combined to produce a construct with two or more regions of 
random mutagenesis. 

A general description of the hybridization of aliquots of
the first and second primer pools to the nucleic acid template
as well as a general description of EIPCR are disclosed in the
detailed description of site-directed mutagenesis beginning on
page 16. The term "inverse" in enzymatic inverse polymerase
chain reaction is used to describe the primer pair orientation
during the PCR process such that at the initiation of
elongation the 3' ends of the primers are directed away from
one another. The mechanics of hybridization and nucleic acid
sequence amplification in library mutagenesis are similar to,
if not identical to, those employed in EIPCR site-directed
mutagenesis and will not be repeated here. Thus, the term
"performing EIPCR" as a step in the production of a library of
mutations following the hybridizing step of the primers to the
template, comprises 1) extending the first pair of
primer-templates to create double-stranded molecules; 2)
denaturing the primer-templates; 3) hybridizing the first and
second primers at least once to the double-stranded molecules
to form a second pair of primer-templates; 4) extending the
second pair of primer-templates following hybridization to
produce double-stranded linear molecules terminating with
class IIS restriction enzyme recognition sequences; and 5)
repeating steps 1-3 as needed.

Once mutated linear template has been generated in
sufficient quantity, the appropriate class IIS restriction
enzyme is used to cleave the nucleic acid to create termini
compatible for ligation. Ligation of the linear molecules is
performed under conditions that favor recircularization of the
plasmid. These conditions are well known to those with skill
in the art and exemplary conditions are described in Example
III.

The nucleic acid is next introduced into the desired host
cells. The nucleic acid can be introduced into the host cells
by any means known to those of skill in the art. These
methods include, but are not limited to, methods to prepare
competent bacterial cells including CaCl2 treatment, and
methods to transfect eukaryotic cells including CaPO4
precipitation, liposome mediated transfection, viral
infection, or electroporation. The method for introducing
nucleic acid into the host cell will, in part, be determined
by the host cell type. Descriptions of each of the
transformation and transfection procedures are found in
recombinant methodology handbooks including those of Sambrook
et al. or Ausubel et al. (supra). Following transfection,
transformation or infection, the cells are expanded and
screened for the desired cell function. There are a variety
of screening assays that are available to the investigator.
Assay design should reflect the desired goal of mutagenesis.
For example, the assay disclosed in Example III below is
designed to detect increased levels of expression of a
particular antibody fragment in E. coli. Assays can also be
designed to detect increases in the binding constants (Ka) of
an antibody or receptor to its antigen or ligand. Other
assays can be designed to detect changes in the level of
protein expression or changes in the functional activity of a
protein. For example, in a eukaryotic system, the increased
ability of a protein to promote growth or stimulate a
particular cellular function can be measured by removing cell
supernatants from mutated cells or their progeny, adding this
supernatant to susceptible cells, and assaying for growth
promoting activity. Those with skill in the art will be able
to select an appropriate screening or selection assay for a
particular library to identify a particular clone of interest.

In a second example, EIPCR library mutagenesis can be
used to alter the expression of one polypeptide in relation to
a second polypeptide. Thus in Example III below, random
mutagenesis is used to increase the level of Fv heavy chain
expression, thereby equalizing levels of heavy and light chain
Fv fragment expression.

In general, once a particular mutation is identified as
conferring a desired property to a protein sequence, the cells
are selected and expanded. The nucleic acid containing the
desired mutation is isolated and sequenced. Identified
sequences from mutations in regulatory regions of a nucleic
acid sequence can then be genetically transposed to other
expression systems. Thus, a contemplated method within the
scope of this invention is one that identifies an optimized
nucleic acid sequence derived from EIPCR library mutagenesis
to promote an increase in the level of protein expression as
compared with wildtype sequence.

The following examples of random EIPCR library 
mutagenesis are provided below. These examples are intended 
to illustrate but not limit the invention. 



EXAMPLE III

This example illustrates a preferred embodiment of EIPCR
library mutagenesis, wobble base mutagenesis. In wobble base
mutagenesis, mutations are introduced into the nucleic acid
sequence without altering the amino acid sequence of the
target protein. In this example, the leader or signal
sequence of a protein is variably mutated in the third base
position of at least one codon to generate a recombinant
library that can be screened for colonies with increased
levels of eukaryotic protein expression as compared with
non-mutated controls. The expression level of foreign proteins
in E. coli is determined by a large number of factors, and
expression level optimization is normally a slow and tedious
process. For secreted proteins, like the exemplary antibody
Fv fragments used here, optimization of expression is
complicated by the difficulties associated with secreting a
eukaryotic protein in a prokaryotic system. Without the
optimized modifications generated by EIPCR library
mutagenesis, described below, secretion and expression of
eukaryotic proteins in prokaryotic systems is very low.

In this particular example, Fv fragment expression of an
anti-metal-chelate antibody (CHA255) was optimized in E. coli.
The Fv fragment was expressed in active form in the periplasm
of E. coli. Both the heavy and light chains of the Fv
fragment, each with its own leader peptide, were placed under
the control of a Lac promoter on a 1.8 kb plasmid. The CHA255
antibody binds a chelated radioactive metal (Indium-111 or
Yttrium-90 chelate complex) to provide a simple screening
assay to permit detection of functional antibody fragments.
For optimization of expression or mutagenesis of other
proteins and antibodies, other screening systems may be
useful.

Expression Vectors

Any expression vector that can be amplified together with
its insert is contemplated within the scope of this invention.
However, we have chosen to exemplify a relatively small
plasmid (< 7 kb) that is readily amplified by PCR. pMCHAFvl,
the 1.8 kb expression vector used for EIPCR mutagenesis and Fv
expression, is shown in Figure 5. The nucleic acid sequence
encoding the light chain of the Fv fragment is 5' to the
nucleic acid sequence encoding the heavy chain of the Fv
fragment. Each chain has its own OmpA signal peptide, and both
chains are driven by a single Lac promoter. The OmpA signal
sequence and Lac promoter sequence are provided in references
from Movva et al. and Reznikoff et al., respectively, which
are hereby incorporated by reference (Movva et al., J. Biol.
Chem. 255:27-29, 1980, J. Mol. Biol. 143:317-328 (1980) and
Reznikoff et al. (1980) "The Lac Promoter", The Operon, Miller
et al. Eds., Cold Spring Harbor Press, NY). The antibody genes
for CHA255 are the same as those used in Example I above. The
codons of the light and heavy chain are those obtained from
the original mouse antibody sequence. Similarly, the OmpA
leader sequence is the native sequence obtained from the OmpA
protein nucleic acid sequence as described in Example I.
pMCHAFvl was constructed from pMINI3 (Figure 5). pMINI3 is a
1.0 kb expression vector which contains a synthetic Lac
promoter, supF (derived from tRNA-tyr, Huang et al., supra)
as the selectable marker, and a rop- ColE1 origin, obtained
from pUC (Pharmacia, Piscataway, N.J.). The supF vectors are
designed to be used with commercially available chemically or
electro-competent E. coli MC1061/P3 cells (Invitrogen Inc., San
Diego, CA). These cells contain amber mutations in both the
ampicillin and tetracycline drug resistance genes, located on
a P3 incompatibility group plasmid. Thus the P3 plasmid can
co-exist with ColE1 incompatibility group plasmids such as
pUC. The P3 plasmid is too large to interfere with pUC
plasmid purification. Transformants are selected on plates
with 25 µg/ml ampicillin and 7.5 µg/ml tetracycline.



Oligonucleotide Synthesis for Wobble Mutagenesis

The two oligonucleotides used to construct the library
are shown schematically in Figure 6B. The oligonucleotides
are designed to hybridize to opposite DNA strands of the
pMCHAFvl template adjacent to the OmpA leader sequence. The
resulting DNA and mRNA derived from this pool of mutated
oligonucleotides is a library of sequences, all encoding the
same OmpA protein sequence. The X in Figure 6B corresponds to
the variable positions within the primer population. The
sequences are provided as SEQ ID NO: 26 and SEQ ID NO: 27.
Here the N corresponds to the X in Figure 6B. Primer
oligonucleotides also contain R and Y base designations. The
R indicates the incorporation of a purine and the Y indicates
the incorporation of a pyrimidine. The limitation of purines
or pyrimidines in the third position of the codon ensures that
the amino acid sequence is not modified by the incorporation
of random nucleotides. Constant regions within the primer are
coded by the appropriate base designation. The primers
(moving 5' to 3') contain, as indicated, filler sequence, a
BsaI class IIS restriction endonuclease recognition site,
filler sequence, a BsaI cleavage site that forms the cohesive
termini for circularization, a region comprising random base
positions in the third position of the nucleotide codon, and
a complementary region to anchor the primer to the template
during hybridization. Oligonucleotide synthesis was performed
on a Milligen/Biosearch 8700 DNA synthesizer (Milligen,
Burlington, MA). The mixed base positions were synthesized
using a fresh 1:1:1:1 molar mixture of each of the four bases
in the U reservoir. The oligonucleotides were made trityl-on
and were purified with Nensorb Prep nucleic acid purification
columns (NEN-Dupont, Boston, MA) as described by the
manufacturer.
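The N, R and Y designations described above determine how many distinct sequences a primer population can contain. The following Python sketch counts and enumerates them for a short degenerate stretch. It is illustrative only: the example stretch is hypothetical and is not taken from SEQ ID NO: 26 or 27, and the function names are assumptions.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes used in the text above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}


def pool_complexity(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for symbol in primer:
        count *= len(IUPAC[symbol])
    return count


def expand(primer):
    """Enumerate every sequence in the pool (practical for short stretches only)."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in primer))]


if __name__ == "__main__":
    stretch = "AARGCNTAY"  # hypothetical: Lys (AAR), Ala (GCN), Tyr (TAY)
    print(pool_complexity(stretch))
    print(expand(stretch)[:4])
```

When two primer populations are used, the number of possible combinations is the product of the two pool complexities, assuming the populations vary independently.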

Amplification Conditions and Generation of Modified Template

PCR was performed in a 100 µl volume. Each reaction
contained 0.5 µM of each purified primer, 0.5 ng pMCHAFvl
template plasmid DNA, 1x Taq buffer, 200 µM of each dNTP and
1 µl Taq polymerase (Perkin-Elmer-Cetus). The thermo-cycling
parameters were: 94°C/3 min for 1 cycle; 94°C/1 min, 50°C/1
min, 72°C/2 min for 3 cycles; 94°C/1 min, 55°C/1 min, 72°C/2
min, with autoextension at 5 sec/cycle for 10 cycles; 94°C/1
min, 55°C/1 min, 72°C/3 min, with autoextension at 8
sec/cycle for 12 cycles; followed by one 10 min cycle at 72°C.
In a PCR reaction, the primers direct the amplification of a
linear DNA sequence of equal length to the template plasmid
with additional 11-14 bp extensions at each end of the DNA
that include the class IIS restriction sequence.
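The cycling program above can be written down as data, which makes the cycle count and programmed hold time easy to check. The following Python sketch is illustrative only. The `Block` class and function names are hypothetical, and the sketch assumes that autoextension adds the stated increment to the final (72°C) step on each cycle after the first in a block; the text does not state whether the first cycle is also extended. Ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass
class Block:
    steps: list            # list of (temperature_C, seconds) holds
    cycles: int            # number of times the block is repeated
    autoextend_s: int = 0  # seconds added to the last step per cycle


# Program transcribed from the thermo-cycling parameters in the text.
PROGRAM = [
    Block([(94, 180)], 1),
    Block([(94, 60), (50, 60), (72, 120)], 3),
    Block([(94, 60), (55, 60), (72, 120)], 10, autoextend_s=5),
    Block([(94, 60), (55, 60), (72, 180)], 12, autoextend_s=8),
    Block([(72, 600)], 1),
]


def block_seconds(block):
    """Programmed hold time for one block, including autoextension."""
    base = sum(seconds for _, seconds in block.steps)
    # Cycle i (0-based) is assumed to add i * autoextend_s seconds.
    extra = block.autoextend_s * sum(range(block.cycles))
    return base * block.cycles + extra


def total_seconds(program):
    return sum(block_seconds(b) for b in program)


def amplification_cycles(program):
    """Count cycles of the three-step denature/anneal/extend blocks."""
    return sum(b.cycles for b in program if len(b.steps) == 3)


if __name__ == "__main__":
    print("amplification cycles:", amplification_cycles(PROGRAM))
    print("programmed hold time (min):", total_seconds(PROGRAM) / 60)
```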

PCR Product Manipulations 

The DNA obtained from 2-4 100 µl PCR reactions was
flushed by addition of dNTPs to 200 µM, 50 units DNA
Polymerase Klenow fragment and 30 units T4 DNA Kinase and
incubated at 37°C for 30 minutes. After phenol/chloroform
extraction and precipitation, the DNA was digested with BsaI
(New England Biolabs, Beverly, MA). The digested DNA was gel
purified, ethanol precipitated, and ligated at low
concentration and without polyethylene glycol to favor
intramolecular interactions, thus favoring circularization of
the nucleic acid as opposed to concatamer formation. The
ligation was ethanol precipitated using ammonium acetate,
washed twice with 80% ethanol, vacuum dried and resuspended in
20 µl 0.1 x TE (Sambrook et al., supra) for electroporation.
After digesting the 12-14 bp overhang with BsaI, the resulting
cohesive termini were ligated intramolecularly, and the
ligation was electroporated into E. coli for expression
analysis.

Electroporation

One microliter amounts of the ligation reaction were
electroporated into 20 µl of MC1061/P3 cells (Invitrogen, San
Diego, CA) using the Invitrogen electroporator. Cells were
plated on 23 x 23 cm plates as described above.

Cell Growth Conditions

For routine cell growth that does not require foreign
protein expression, the cells were grown in M9CA media (Merril
et al., Proc. Natl. Acad. Sci. (USA) 74:4335-4339, 1979), which
is hereby incorporated by reference.

For colony lift screening assays, the cells were plated
on 23 x 23 cm plates with CS agar (48 g/l yeast extract, 24 g/l
tryptone, 3 g/l NaH2PO4, 3 g/l Na2HPO4, 15 g/l agar) with 0.5
µg/ml isopropylthiogalactoside (IPTG) (Boehringer Mannheim,
Indianapolis, IN) for induction of protein expression.

For expression level determination, clones were grown in
CS broth with 0.2 mM IPTG in baffled shaker flasks at 250 rpm
for 30 hours at 30°C, with a boost of 0.2 volumes of 240 g/l
yeast extract and 120 g/l tryptone after 18 hours. The Fv
expressing constructs were grown at 30°C. CS broth permits
the use of higher levels of IPTG before over-expression of the
foreign protein causes bacterial death. Thus, with CS broth
most of the Fv protein can be found in the media rather than
in the bacterial periplasm.

Size Determination of the Random Library

The molar ratios of fresh bases were reflected accurately
in the oligonucleotide pool as determined by the methods of
Hermes et al. (Proc. Natl. Acad. Sci. (USA) 87:696-700, 1990),
which is hereby incorporated by reference. The ratio of bases
in the mixed sites within the PCR product was verified by DNA
sequencing a representative sampling of individual clones.
The composition of the mixed site residues in the PCR product
was 19% G, 31% A, 25% T, 25% C (n=119).

The theoretical maximum complexity of the library is
8 x 10^9 different sequences. The actual size of the library
was determined by plating. In a typical electroporation,
5 x 10^5 colony forming units (cfu) were obtained from
electroporation of 1 µl of ligation mixture into 20 µl of
cells. The ligation contained 0.5 µg of DNA in 20 µl. The
library size is thus about 1 x 10^7 and the efficiency was
2 x 10^7 cfu/µg. For this particular example, the screening
assay was found to be more limiting than library size.
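The library size and transformation efficiency quoted above follow from the plating numbers by simple arithmetic. The Python sketch below reproduces that calculation. The function names are hypothetical, and the library size assumes that the whole 20 µl ligation would transform at the same rate as the 1 µl that was plated.

```python
def library_size(cfu_per_ul, ligation_volume_ul):
    """Total transformants if the entire ligation were electroporated."""
    return cfu_per_ul * ligation_volume_ul


def transformation_efficiency(cfu_per_ul, dna_ug, ligation_volume_ul):
    """Colony forming units per microgram of ligated DNA."""
    dna_ug_per_ul = dna_ug / ligation_volume_ul
    return cfu_per_ul / dna_ug_per_ul


if __name__ == "__main__":
    cfu_per_ul = 5e5          # cfu from 1 µl of ligation mixture
    ligation_volume_ul = 20   # total ligation volume
    dna_ug = 0.5              # DNA in the ligation
    theoretical_max = 8e9     # theoretical maximum complexity from the text

    size = library_size(cfu_per_ul, ligation_volume_ul)
    eff = transformation_efficiency(cfu_per_ul, dna_ug, ligation_volume_ul)
    print(f"library size: {size:.1e} clones")
    print(f"efficiency:   {eff:.1e} cfu/µg")
    print(f"fraction of theoretical complexity: {size / theoretical_max:.2%}")
```

On these figures the library covers only a small fraction of the theoretical complexity, and the roughly 5 x 10^5 clones actually screened are a smaller sample still.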

Colony Screening Assay

Colony lifts of 23 cm x 23 cm plates with 0.3-1 x 10^5
colonies were prepared using BA83 nitrocellulose filters
(Schleicher and Schuell, Keene, NH). The filters were blocked
by incubation in 3% non-fat milk in 25 mM Tris-HCl pH 7.5 for
10 minutes, washed with 25 mM Tris, followed by incubation in
25 mM Tris containing 50 µCi of chelated Indium-111 or
Yttrium-90 per filter for 1 hour at room temperature. The
filters were then washed with 25 mM Tris for a total of 15
minutes, dried and exposed to Kodak X-omat AR autoradiography
film for several hours.

Approximately 5 x 10^5 clones were screened, and a wide
range of signals were obtained on the primary screen.
Bacterial colonies that corresponded to strong filter signals
were purified by replating. These were again assayed for
activity. Two colonies with very strong signals were colony
purified and reassayed. The expression level of these two
clones was about ten times that of the wildtype. Assay design
for the expression of other antibody fragments in E. coli is
outlined by Skerra et al. (Anal. Biochem. 196:151-155, 1991),
which is hereby incorporated by reference.

Elimination of the Effect of Unintended Mutations

With any mutagenesis procedure there is a risk of
introducing mutations in areas other than the target. To
demonstrate that the observed increase in protein expression
was the result of the nucleic acid sequence identified from
the selected clone, a 130 bp fragment containing the mutated
area was cloned back into wildtype pMCHAFvl DNA. This
construct expressed more protein than the wildtype sequence,
proving that the 10-fold increase in the level of protein
expression as compared with wildtype controls is the result of
the mutated sequence.

DNA Sequencing

The sequence of the 130 bp fragment containing the
mutation that conferred increased protein expression was
determined by double-stranded dideoxy sequencing on a Dupont
Genesis 2000 automated sequencer using the Dupont Genesis 2000
sequencing kit. The DNA sequence of the 130 bp fragment
differed from the wildtype sequence only at the targeted
wobble bases, confirming that the amino acid sequence was not
altered by the mutagenesis procedure. No mutations outside of
the targeted wobble bases were observed. The optimized
sequences obtained by this method are provided in Figure 6C
and listed as SEQ ID NO: 28 and SEQ ID NO: 29. These
sequences can then be further defined to more specifically
determine the expression promoting regions contained therein.
Therefore SEQ ID NO: 28 and SEQ ID NO: 29 or fragments thereof
can be used in subsequent expression systems to promote the
expression of the same or different protein.



Fv expression level quantitation 
The expression level of Fv fragments was determined by
assaying cell free supernatants. Wildtype and purified mutant
colonies were grown under expression conditions in CS broth as
described above. Dilutions of antibody containing samples
were incubated with radiolabeled metal-chelate. After
incubation for one hour, the free, unbound metal chelate was
separated from the antibody-bound metal chelate by
centrifugation through a Millipore ultrafree filter (molecular
weight cut-off of 10,000 MW, Millipore, Bedford, MA). Samples
of the filtrate and the pre-filtration mixture were counted
for radioactivity, yielding a "fraction bound". A standard
curve of "fraction bound" versus known amounts of antibody was
constructed. The amount of Fv in an unknown sample was
determined from the standard curve. The results of the assay
indicated that the mutants reproducibly expressed 10 times
more active Fv fragment than the original construct.

The protein sequence of the antibody fragments in this
example is not altered by wobble base mutagenesis. Therefore
any difference in signal strength in the screening assay is
due to differences in expression levels. However, the
expression level may be affected by the mutation in several
ways. The mRNA stability could be improved by the mutation.
Similarly, initiation and translation from the ribosome may be
improved. Further, protein expression is strongly influenced
by the sequence of the first few codons following the ATG
initiation codon (Bucheler et al., Gene 98:271-276, 1990).
Therefore, wobble base mutagenesis can potentially influence
polypeptide expression in a number of ways depending on where
the mutagenic primers bind to nucleic acid and which random
mutations are conferred upon the sequence.

EXAMPLE IV

In another preferred embodiment of this invention, EIPCR
is used to create a promoter library for gene expression in E.
coli.

In this particular example of the preparation of a
promoter library, Fv fragment expression of the anti-metal
chelate antibody (CHA255) is optimized using a population of
primers with variable sequences in the promoter region (Figure
7).

Expression Vectors 

In this example, the plasmid used is pCCHAVll, a 2.4 Kb 
plasmid containing the Lac promoter followed by an OmpA leader 
sequence linked to the antibody light chain fragment sequence 
and followed by an optimized OmpA sequence linked to the 
antibody heavy chain fragment. Both antibody chain sequences 
are driven by a single Lac promoter. This optimized OmpA 
sequence (SEQ ID NO: 26) is derived from Example III. Plasmid 
pCCHAVll is LacI negative, chloramphenicol resistance gene 
positive with a Rop- ColE1 origin. In this example a 
second copy of the Lac promoter region is placed in front of 
the antibody heavy chain fragment sequence. The nucleic acid 
sequence is provided in Figure 7A (ID SEQ NO: 30) and the 
inserted promoter sequence is provided in Figure 7B and as ID 
SEQ NO: 33. The inserted region includes the Lac promoter 
library region followed by the wildtype Lac operator followed 
by the ribosome binding site. The sequence including the 
ribosome binding site is provided in ID SEQ NO: 34. 

Oligonucleotide Synthesis 

The primers used to create the recombinant promoter 
library are provided as ID SEQ NO: 31 and ID SEQ NO: 32. ID 
SEQ NO: 31 directed mutations to the ribosome binding site 
while ID SEQ NO: 32 directed changes to the Lac promoter 
region. In Figure 7B the ribosome binding site and the -10 and 
-35 regions of the Lac promoter are underlined, and the 
sequences are provided as ID SEQ NO: 34 and ID SEQ NO: 33, 
respectively. The bold underlining in Figure 7B corresponds 
to the primer regions in ID SEQ NO: 31 and ID SEQ NO: 32 that 
are underlined. The underlined portions are those positions 
along the primer that contain variability. The expected 
frequency of variability at each nucleotide position is 
derived from a mixture of 75% template nucleotide and 8.3% of 
each of the remaining three nucleotides. For example, in ID 
SEQ NO: 31, the first underlined position is a cytosine. The 
expected bias of the primer population at this position is: 
75%:C, 8.3%:G, 8.3%:T, 8.3%:A. Libraries were created using 
primer populations based on ID SEQ NO: 31 and ID SEQ NO: 32. 
Other libraries were created using one biased primer 
population while the other member of the primer pair contained 
no variability. As an example, a recombinant library was 
created using ID SEQ NO: 31 to prepare a variable first primer 
pool, while the second primer corresponded exactly with ID SEQ 
NO: 32 and therefore contained no variability. The library 
generated from these primers contains mutated sequences at the 
ribosome binding site and a constant Lac promoter sequence. 
The oligonucleotides comprise a BsaI restriction endonuclease 
recognition site, a region of variability reflected in the 
underlined portion of ID SEQ NO: 31 and ID SEQ NO: 32, and a 
region complementary to the template. 
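
The 75% / 8.3% bias described above can be sketched numerically. The following is an illustrative model only, assuming each variable position is doped independently; the function names are hypothetical, and the 12-base stretch used in the usage lines is the ribosome binding site sequence of ID SEQ NO: 34, chosen purely as sample input.

```python
import random
from math import comb

BASES = "ACGT"


def position_bias(template_base, keep=0.75):
    """Base frequencies at one doped position: `keep` for the template
    base, the remainder split equally over the other three bases."""
    other = (1.0 - keep) / 3.0
    return {b: (keep if b == template_base else other) for b in BASES}


def sample_doped(template, keep=0.75, rng=None):
    """Draw one primer variant, doping every position independently."""
    rng = rng or random.Random()
    variant = []
    for base in template:
        bias = position_bias(base, keep)
        variant.append(rng.choices(BASES, weights=[bias[b] for b in BASES])[0])
    return "".join(variant)


def mutation_count_distribution(n_positions, keep=0.75):
    """Binomial probability of k = 0..n substituted positions per primer."""
    p = 1.0 - keep
    return [comb(n_positions, k) * p ** k * keep ** (n_positions - k)
            for k in range(n_positions + 1)]


if __name__ == "__main__":
    region = "CAGGAAACAGCT"  # sample input only
    print(position_bias("C"))
    print(sample_doped(region, rng=random.Random(0)))
    for k, prob in enumerate(mutation_count_distribution(len(region))):
        print(k, round(prob, 4))
```

Under this independence assumption, the fraction of primers carrying no change at all in an n-base variable region is 0.75 raised to the power n, which gives a rough feel for how many library members remain wild type.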

PCR Amplification and Product Manipulation 

Sequences were amplified using conditions outlined in 
Example III. Following amplification the nucleic acid was 
cleaved with BsaI and ligated. Nucleic acid was 
electroporated into E. coli. 
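
The cleave-and-ligate step relies on the two product ends carrying complementary overhangs once the class IIS enzyme has cut. A minimal sketch of that check follows, using the primers listed as SEQ ID NO: 2 and SEQ ID NO: 3 as sample input and assuming the usual BsaI geometry, GGTCTC(N)1 on the primer strand and (N)5 on the opposite strand, which leaves a four-base 5' overhang. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER_A = "ATTAGGTCTCGGTTCCCGCGGTATCATTGCAGCACT"  # SEQ ID NO: 2
PRIMER_B = "AATTGGTCTCGGAACCACGCTCACCGGCTCCAGAT"   # SEQ ID NO: 3


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def class_iis_overhang(primer, site="GGTCTC", gap=1, overhang_len=4):
    """Return the 5' overhang (read 5'->3' on the primer strand) left on
    the retained product end after cutting `gap` bases past the site."""
    start = primer.index(site)  # raises ValueError if the site is absent
    cut = start + len(site) + gap
    return primer[cut:cut + overhang_len]


def ends_can_ligate(primer_a, primer_b):
    """True if the two cut ends can anneal for recircularization."""
    return class_iis_overhang(primer_a) == reverse_complement(
        class_iis_overhang(primer_b))


if __name__ == "__main__":
    print(class_iis_overhang(PRIMER_A), class_iis_overhang(PRIMER_B))
    print("compatible ends:", ends_can_ligate(PRIMER_A, PRIMER_B))
```

Because the recognition site lies outside the overhang, the cut removes the site itself and only the designed sequence remains in the recircularized plasmid.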

Colony Screening Assay and Identification of Positive Clones 

The screening assay is described in Example III. 
Colonies with increased levels of hapten binding are 
identified and colony purified. These colonies are expanded 
and analyzed for the presence of unintended mutations. 
Optimized promoter sequences are identified by sequencing the 
expression plasmids from positive colonies. 

EXAMPLE V 

In yet another preferred embodiment of this invention, 
EIPCR is employed to create a eukaryotic mutagenesis library. 
Similar to EIPCR in E. coli, any region of a eukaryotic vector 
can be modified. Eukaryotic expression vectors may be 
modified in regulatory regions or within translated regions of 
a particular gene. In this example, a retroviral expression 
vector pLN is used to generate a library of mutations within 
the ribosome binding site of the Neomycin resistance gene. The 
ribosome binding site, also known as a Kozak sequence (Kozak, 
M., Nuc. Acids Res. 12(2):857-72, 1984, which is hereby 
incorporated by reference), is a highly conserved region in 
eukaryotic cells comprising the consensus sequence 
CCACCATG(G). 

Expression Vector 

The retroviral expression plasmid pLN was obtained from 
A.D. Miller and is described in a publication by Miller et al. 
(BioTechniques 7(9):980-990, 1989, which is hereby incorporated 
by reference). The vector contains two Moloney Murine 
Leukemia Virus (MoMuLV) long terminal repeats (LTR). Between 
the LTR regions is the Neomycin resistance gene (NeoR). The 
NeoR ribosome binding site is targeted for library mutagenesis 
to confer increased resistance to G418 in the eukaryotic cell 
line NIH 3T3 (ATCC). The plasmid has a final size of 6 kb. 

Oligonucleotide Synthesis 

Oligonucleotides are prepared that are similar in design 
to those described for Example I above. The primers are 
designed to flank the NeoR ribosome binding site and are 
substantially complementary to both strands of DNA. A short 
(4-10 bp) variable region is designed to overlap the ribosome 
binding site. Thus, the oligonucleotides contain a class IIS 
recognition site, the variable region, and a twenty base 
complementary region that anchors the oligonucleotides to the 
pLN plasmid. 
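
The primer layout just described can be sketched as a simple string assembly. This is an illustration under stated assumptions, not the primers actually used: the 5' pad, the single spacer base, the fixed four-base overhang segment and the anchor sequence are all hypothetical placeholders, and BsaI (GGTCTC) stands in for whichever class IIS enzyme is chosen.

```python
def build_primer(anchor, variable_len, overhang,
                 site="GGTCTC", pad="ATTA", spacer="G"):
    """Assemble pad + class IIS site + spacer + fixed overhang
    + variable region (written as N) + template-complementary anchor."""
    if not 4 <= variable_len <= 10:
        raise ValueError("variable region is described as 4-10 bases")
    if len(anchor) != 20:
        raise ValueError("anchor is described as twenty bases")
    return pad + site + spacer + overhang + "N" * variable_len + anchor


def library_complexity(variable_len):
    """Number of distinct sequences if every variable position is fully random."""
    return 4 ** variable_len


if __name__ == "__main__":
    hypothetical_anchor = "ACGTACGTACGTACGTACGT"  # placeholder, not from pLN
    primer = build_primer(hypothetical_anchor, 6, overhang="GTTC")
    print(primer)
    for n in range(4, 11):
        print(n, library_complexity(n))
```

The complexity figures indicate how many transformants or infected cells would be needed to sample a fully randomized region of a given length; a biased (doped) primer pool samples a smaller neighbourhood around the starting sequence.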

Amplification Conditions 

Reaction tubes are prepared for PCR in a final 100 µl 
reaction volume. Reaction conditions are optimized from 
initial reaction conditions as outlined in Example III. 
Following PCR, the DNA is purified, cleaved with the desired 
class IIS restriction endonuclease, recircularized and 
ligated. 

Isolation of Packaged Vector 

Ligated product from the PCR reaction is electroporated 
into the helper virus packaging cell line PE501 obtained from 
A.D. Miller and described by Miller et al., supra. Mutated pLN 
is transiently packaged into retroviral particles using PE501. 
Cell supernatant containing viral particles is harvested from 
the packaging cell line and titered on virus susceptible NIH 
3T3 cells (ATCC). 

Selection and Identification of Mutated Sequences 

Colonies expressing mutations are selected with elevated 
levels of G418, preferably between 0.75 and 2.5 mg/ml. These 
colonies are expanded, lysed, and if desired, the DNA is 
purified. The optimized promoter region is retrieved from the 
selected cells by PCR. This new Kozak sequence can then be 
reintroduced into pLN to verify that the new sequence confers 
elevated G418 resistance. The region is sequenced to identify 
the selected nucleic acid sequence. The results from this work 
permit the identification of sequences conferring increased 
G418 resistance and facilitate the identification of Kozak 
sequence requirements and the isolation of improved sequences 
that can be transferred to other constructs to improve the 
expression of other protein sequences. 

It is additionally contemplated that this technology 
could be applied to any gene in combination with a selectable 
marker such as NeoR. Therefore any gene or portion of a gene 
can be mutated and initially selected by its resistance to 
Neomycin. Subsequent selection will be required to 
distinguish the optimized mutation. Neomycin resistance is 
just one of a variety of selection systems useful for EIPCR 
library mutagenesis applications. For example, as a selection 
procedure, transfected cells can be screened by a Fluorescence 
Activated Cell Sorter (FACS) and positive colonies expanded 
from these cells for further analysis. 

Thus, EIPCR library mutagenesis is a reliable and 
efficient method for obtaining optimized nucleic acid 
sequences. EIPCR reactions have an efficiency of 95% or 
better in reactions designed to measure the efficiency of 
mutagenesis. EIPCR library mutagenesis is generally 
applicable for de novo design or redesign of protein or 
nucleic acid sequences. 

Although the invention has been described with reference 
to the above examples, it should be understood that various 
modifications can be made by those skilled in the art without 
departing from the invention. Accordingly, the invention is 
limited only by the following claims. 
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: STEMMER, WILLEM 

(ii) TITLE OF INVENTION: ENZYMATIC INVERSE POLYMERASE CHAIN 
REACTION 

(iii) NUMBER OF SEQUENCES : 32 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE : KNOBBE, MARTENS, OLSON & BEAR 

(B) STREET : 620 NEWPORT CENTER DRIVE, SIXTEENTH FLOOR 

(C) CITY: NEWPORT BEACH 

(D) STATE: CALIFORNIA 

(E) COUNTRY: UNITED STATES 

(F) ZIP: 92660 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: ISRAELSEN, NED A. 

(B) REGISTRATION NUMBER: 29,655 

(C) REFERENCE/DOCKET NUMBER: HYBRIT. 001CP1 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 619-235-8550 

(B) TELEFAX: 619-235-0189 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 54 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
AAATCTGGAG CCGGTGAGCG TGGGTCTCGC GGTATCATTG CAGCACTGGG GCCA 54 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 36 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
ATTAGGTCTC GGTTCCCGCG GTATCATTGC AGCACT 36 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 35 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
AATTGGTCTC GGAACCACGC TCACCGGCTC CAGAT 35 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 54 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
AAATCTGGAG CCGGTGAGCG TGGTTCCCGC GGTATCATTG CAGCACTGGG GCCA 54 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GGTCTCNNNN N 11 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GAAGACNNNN NN 12 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
CTCTTCNNNN 10 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GAATGCN 7 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
ACCTGCNNNN NNNN 14 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GGATCNNNNN 10 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GCAGCNNNNN NNNNNNN 17 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GTCTCNNNNN 10 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 6 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
ACTGGN 6 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GGATGNNNNN NNNNNNNN 18 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GACGCNNNNN NNNNN 15 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GGTGANNNNN NNN 13 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GAAGANNNNN NNN 13 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GAGTCNNNNN 10 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
GCATCNNNNN NNNN 14 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
CCTCNNNNNN N 11 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 90 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
AGGAACCAAA CTGACTGTCC TAGGATAGAA GGAGATATAT CATGAAAAAG ACAGCTGGCC 60 
CAGGCCGAGG TGACCCTGGT GGAGTCTGGG 90 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 58 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
ATTAGAAGAC TACTCCNNNN NNNNNNNNNN NNNNNNNGAG GTGACCCTGG TGGAGTCT 58 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 58 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
AATTGAAGAC ATGGAGNNNN NNNNNNNNNN NNNNNNTCCT AGGACAGTCA GTTTGGTT 58 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 94 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 2..94 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

A GGA ACC AAA CTG ACT GTC CTA GGA CGG AAA TCG GGG CGG TCT ACC 46 
  Gly Thr Lys Leu Thr Val Leu Gly Arg Lys Ser Gly Arg Ser Thr 
  1               5                   10                  15 

TCC CCT CTC CCA ATA AAA TTA GGG GAG GTG ACC CTG GTG GAG TCT GGG 94 
Ser Pro Leu Pro Ile Lys Leu Gly Glu Val Thr Leu Val Glu Ser Gly 
                20                  25                  30 



WO 93/12257                                    PCT/US92/10647 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 31 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Gly Thr Lys Leu Thr Val Leu Gly Arg Lys Ser Gly Arg Ser Thr Ser 
1               5                   10                  15 

Pro Leu Pro Ile Lys Leu Gly Glu Val Thr Leu Val Glu Ser Gly 
            20                  25                  30 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 72 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
TCTATAGGTC TCTTTGCNGT NGCNCTNGCN GGNTTYGCNA CNGTNGCNCA RGCNGAGGTG 60 
ACCCTGGTGG AG 72 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 56 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
TATTAAGGTC TCAGCAATNG CRATNGCNGT YTTYTTCATG ATATATCTCC TTCTAT 56 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 63 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

ATG AAA AAA ACC GCG ATC GCC ATT GCT GTG GCG CTT GCC 39 
MET LYS LYS THR ALA ILE ALA ILE ALA VAL ALA LEU ALA 
1               5                   10 

GGC TTT GCT ACG GTG GCG CAG GCA 63 
GLY PHE ALA THR VAL ALA GLN ALA 
    15                  20 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 63 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

ATG AAA AAA ACT GCA ATT GCG ATT GCT GTT GCT CTT GCT 39 
MET LYS LYS THR ALA ILE ALA ILE ALA VAL ALA LEU ALA 
1               5                   10 

GGT TTC GCG ACG GTA GCA CAG GCC 63 
GLY PHE ALA THR VAL ALA GLN ALA 
    15                  20 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
AAGGAGATAT ATC 13 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 85 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
AACTATTGGT CTCAGTGGAA TTGTGAGCGG ATAACAATTT CACACAGGAA ACAGCTATGA 60 
AAAAAACCGC GATCGCCATT GCTGT 85 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 109 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
ATCATTAGGT CTCACGACAG AACATACGAG CCCGAAGCAT AAAGTGTAAA GCCTGGGGTG 60 
AAAAAAAAAG GCTCCAAAAG GAGCCTTTCT ATCCTAGGAC AGTCAGTTT 109 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 41 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
CACCCCAGGC TTTACACTTT ATGCTTCCGG CTCGTATGTT G 41 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
CAGGAAACAG CT 12 
We Claim: 

1. A method for generating a recombinant mutagenesis 
library by introducing one or more changes within a 
predetermined region of double stranded nucleic acid, 
comprising: 

(a) providing a first primer population and a 
second primer population, each said population having a 
variable base composition at known positions along said 
primers, said primers incorporating a class IIS 
restriction enzyme recognition sequence, being capable 
of directing change in said nucleic acid sequence and 
being substantially complementary to said double-stranded 
nucleic acid to allow hybridization thereto; 

(b) hybridizing said first and second primer 
populations to opposite strands of said double stranded 
nucleic acid to form a first pair of primer-templates 
oriented in opposite directions; 

(c) performing enzymatic inverse polymerase chain 
reaction to generate at least one linear copy of said 
double stranded nucleic acid incorporating said change 
directed by said primers; 

(d) cutting the double stranded nucleic acid copy 
of step (c) with a class IIS restriction enzyme to form 
a restricted linear nucleic acid molecule containing 
said change; and 

(e) introducing nucleic acid generated from step 
(c) or (d) into compatible host cells. 

2. The method of Claim 1, additionally comprising the 
step of joining termini of said restricted linear nucleic 
acid molecule of step (d) to produce double-stranded 
circular nucleic acid. 

3. The method of Claim 1, wherein said restricted 
linear nucleic acid molecule produced in step (d) contains 
only said change in said nucleic acid sequence. 

4. The method of Claim 1, wherein at least steps (b) 
and (c) are repeated one or more times. 

5. The method of Claim 1, wherein said double-stranded 
nucleic acid is circular DNA. 

6. The method of Claim 1, wherein step (d) further 
comprises treating said restricted nucleic acid molecule 
with a polymerase under conditions which create blunt ends. 

7. The method of Claim 1, wherein said host cells are 
bacteria. 

8. The method of Claim 1, wherein said double stranded 
nucleic acid encodes polypeptide. 

9. The method of Claim 8, additionally comprising the 
step of expressing said polypeptide encoded by the nucleic 
acid of step (e). 

10. The method of Claim 1, wherein said cells are 
eukaryotic. 

11. The method of Claim 8, wherein said change is 
located within a polypeptide encoding region of the 
double-stranded nucleic acid. 

12. The method of Claim 8, wherein said change is 
located within a regulatory region of said double-stranded 
nucleic acid. 

13. The method of Claim 12, wherein said change is 
located within a promoter region of said double-stranded 
nucleic acid. 

14. The method of Claim 8, wherein said change is 
located within the enhancer region of said double-stranded 
nucleic acid. 

15. The method of Claim 1, wherein said double 
stranded nucleic acid comprises a viral vector. 

16. The method of Claim 15, wherein said compatible 
host cells comprise a helper virus packaging cell line that 
directs the packaging of viral particles containing said 
viral vector. 

17. The method of Claim 16, comprising the step of 
collecting said viral particles. 

18. The method of Claim 17, additionally comprising 
the step of infecting susceptible cells with said viral 
particles. 

19. A recombinant library created by the method of 
Claim 1. 

20. A method for improving polypeptide expression from 
a double-stranded nucleic acid sequence encoding polypeptide 
comprising: 

(a) measuring polypeptide expression from said 
double-stranded nucleic acid in a compatible host cell; 

(b) providing a first primer population and a 
second primer population, each said population having a 
variable base composition at known positions along said 
primers, said primers incorporating a class IIS 
restriction enzyme recognition sequence, being capable 
of directing change in said nucleic acid sequence and 
being substantially complementary to said double-stranded 
nucleic acid to allow hybridization thereto; 

(c) hybridizing said first and second primer 
populations to opposite strands of said double stranded 
nucleic acid to form a first pair of primer-templates 
orientated in opposite directions; 

(d) performing enzymatic inverse polymerase chain 
reaction to generate at least one linear copy of said 
double stranded nucleic acid incorporating said change 
directed by said primers; 

(e) cutting said double stranded nucleic acid 
copy of step (d) with a class IIS restriction enzyme to 
form a restricted linear nucleic acid molecule 
containing said change; 

(f) introducing said nucleic acid generated from 
step (d) or (e) into said host cells; 

(g) measuring polypeptide expression from said 
modified nucleic acid of step (f) in said cells; and 

(h) identifying cells with expression levels 
greater than the expression levels measured in step 

21. The method of Claim 20, additionally comprising 
the step of joining termini of said restricted linear 
nucleic acid of step ( ) to produce modified double-stranded 
circular nucleic acid. 

22. The method of Claim 20, additionally comprising 
the step of obtaining modified template from said identified 
cells. 

23. The method of Claim 22, comprising the step of 
identifying the modified nucleic acid sequence. 

24. The method of Claim 22, comprising transferring 
the modified sequence into another nucleic acid sequence. 

25. The method of Claim 21, wherein said primers 
direct changes in a promoter sequence. 

26. The method of Claim 21, wherein said primers 
direct changes in a polypeptide sequence. 

27. The method of Claim 21, wherein said compatible 
cells are bacteria. 

28. The method of Claim 21, wherein said cells are 
eukaryotes. 

29. The method of Claim 21, wherein said primers 
direct changes in a ribosome binding sequence. 

30. A method for generating a recombinant library 
using wobble-base mutagenesis comprising: 

(a) providing a first primer population and a 
second primer population, said primers being 
substantially complementary to a region of double 
stranded nucleic acid encoding polypeptide to allow 
hybridization thereto, said primers having a variable 
base composition in the third position of at least one 
nucleotide codon corresponding to said double stranded 
nucleic acid and a class IIS restriction enzyme 
recognition sequence; 

(b) hybridizing said first and second primer 
populations to opposite strands of said double stranded 
nucleic acid to form a first pair of primer-templates 
orientated in opposite directions; 

(c) performing enzymatic inverse polymerase chain 
reaction to generate at least one linear copy of said 
double stranded nucleic acid incorporating said change 
directed by said primers; 

(d) cutting said double stranded linear nucleic 
acid of step (c) with a class IIS restriction enzyme to 
form restricted linear nucleic acid molecule containing 
said change; and 

(e) introducing nucleic acid generated from step 
(c) or (d) into compatible host cells. 

31. The method of Claim 30, additionally comprising 
joining termini of said restricted linear nucleic acid of 
step (d) to produce double-stranded circular nucleic acid. 

32. The method of Claim 30, wherein said variable base 
codons do not alter the corresponding amino acid sequence of 
said polypeptide. 

33. The method of Claim 30, wherein said primers 
direct alterations in the leader sequence of said 
polypeptide. 

34. The method of Claim 30, wherein said host cells 
are bacteria. 

35. The method of Claim 33, wherein said leader 
sequence is the bacterial OmpA protein leader sequence or a 
fragment thereof. 

36. The method of Claim 33, wherein said leader 
sequence is linked to polynucleotide encoding light and 
heavy chain antibody fragments. 

37. An optimized OmpA protein leader: 
5' ATGAAAAAAACTGCAATTGCGATTGCTGTTGCTCTTGCTGGTTTCGCGACGGTAGCACAGGCC 3', 
or an expression promoting fragment thereof. 

38. An optimized OmpA protein leader sequence: 
5' ATGAAAAAAACCGCGATCGCCATTGCTGTGGCGCTTGCCGGCTTTGCTACGGTGGCGCAGG 3', 
or an expression promoting fragment thereof. 
[Figure: schematic of the EIPCR procedure. PCR template: uncut plasmid, 
maximum about 4 kb. PCR: denature, anneal primers, extend with Vent 
polymerase. Primer A and Primer B each carry the mutation(s) and a BbsI 
recognition site (GAAGACNNNNNN). Subsequent steps: fill in ends with 
Klenow, cut with BbsI, ligate, transform, sequence.] 
Figure 2 

[Schematic. Wildtype template, with the base changes T and C marked 
above the BsaI site region: 

AAATCTGGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCA 
TTTAGACCTCGGCCACTCGCACCCAGAGCGCCATAGTAACGTCGTGACCCCGGT 

Steps: design primers (primer A, primer B); PCR; cut with BsaI; 
ligate, transform; sequence. Final panel: Desired Mutations.] 
FIGURE 3 

CLASS 2S RESTRICTION ENZYMES SUITABLE FOR USE WITH EIPCR 

BsaI     GGTCTCNNNNN 
BbsI     GAAGACNNNNNN 
EarI     CTCTTCNNNN 
BsmI     GAATGCN 
BspMI    ACCTGCNNNNNNNN 
AlwI     GGATCNNNNN 
BsmAI    GTCTCNNNNN 
BsrI     ACTGGN 
FokI     GGATGNNNNNNNNNNNNN 
HgaI     GACGCNNNNNNNNNN 
HphI     GGTGANNNNNNNN 
MboII    GAAGANNNNNNNN 
PleI     GAGTCNNNNN 
SfaNI    GCATCNNNNNNNNN 
MnlI     CCTCNNNNNNN 